freenode/#sicl - IRC Chatlog

8:37:10 MichaelRaskin If your left recursion is structured simply enough, you can do a pretty trivial trick (that's what I did whenever I needed it)

8:38:07 beach Maybe so, but the paper cited by splittist seems to solve the general problem and again, it is a document well written.

8:38:36 MichaelRaskin Well, I would not consider the result they obtain in the worst case «packrat»

8:39:02 MichaelRaskin (They even say in the abstract that pathological cases have superlinear parse time)

8:39:07 beach OK.

8:39:13 MichaelRaskin On the other hand, you do not need pathological cases anyway.

8:39:33 beach It is fine with me to call it something else.

8:40:48 heisig Good morning!

8:40:52 beach Hello heisig.

8:41:21 beach My (admittedly small) family says that an adverb followed by a perfect participle should not make the two tied with a hyphen.

8:45:04 MichaelRaskin beach: I guess my point is moot because you might only need 3.2 from the paper (direct left recursion) which is indeed well written, and clearly still packrat, and equivalent to the simplest thing done for this case.
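
One common way to support direct left recursion in a memoizing parser is to "grow a seed": record a failure for the rule at the current position, evaluate the rule body, and keep re-evaluating as long as the match gets longer. The sketch below illustrates that idea for a toy grammar. It is not taken from the paper under discussion or from any library; the grammar, the function names, and the (VALUE . NEXT-POSITION) result convention are all assumptions made for illustration, and it handles direct left recursion only.

```lisp
;; Toy PEG, assumed for illustration:  expr <- expr "-" digit / digit
;; A parse result is (VALUE . NEXT-POSITION), or NIL on failure.

(defvar *memo*)  ; maps a position to the memoized result of EXPR there

(defun parse-digit (input pos)
  (when (and (< pos (length input)) (digit-char-p (char input pos)))
    (cons (digit-char-p (char input pos)) (1+ pos))))

(defun expr-body (input pos)
  "Evaluate the body of EXPR once; the recursive call goes through the memo."
  (let ((left (parse-expr input pos)))
    (or (and left
             (< (cdr left) (length input))
             (char= (char input (cdr left)) #\-)
             (let ((right (parse-digit input (1+ (cdr left)))))
               (and right
                    (cons (list '- (car left) (car right)) (cdr right)))))
        (parse-digit input pos))))

(defun parse-expr (input pos)
  (multiple-value-bind (entry found) (gethash pos *memo*)
    (if found
        entry
        (progn
          ;; Seed the memo with failure so the left-recursive call terminates.
          (setf (gethash pos *memo*) nil)
          (loop for result = (expr-body input pos)
                for best = (gethash pos *memo*)
                do (if (and result
                            (or (null best) (> (cdr result) (cdr best))))
                       (setf (gethash pos *memo*) result) ; grow the seed
                       (return best)))))))

(defun parse (input)
  (let ((*memo* (make-hash-table)))
    (parse-expr input 0)))

(parse "9-3-2")  ; => ((- (- 9 3) 2) . 5)
```

The result is left-associative, which is the reason for writing the rule left-recursively in the first place. Mutual (indirect) left recursion needs more bookkeeping than this sketch has.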

8:46:03 beach I would rather call it something else and have it handle mutual left recursion.

8:47:14 heisig beach: Where did I tie an adverb to a perfect participle?

8:48:13 beach Oh, I guess your case is adverb + adjective. But the same rule applies.

8:48:25 beach "widely applicable" rather than "widely-applicable".

8:48:32 splittist strongly typed v strongly-typed?

8:48:55 beach Yeah, but it goes for adverb+adjective as well.

8:48:59 heisig Heh, I see. We Germans like to tie words together :)

8:49:46 beach So do Swedes. But then, I am not Swedish anymore, at lest for all practical purposes.

8:49:53 beach *at least

8:53:33 splittist Working on Clime, I feel I'm doing proper modern programming: sending strings over sockets...

8:55:06 beach YAY! A "micro service"!

8:56:46 heisig I just pushed a commit that removes a hyphen.

8:57:31 beach Great! Thanks. I could have done it myself, but, you know, "give a person a fish...".

8:59:26 heisig Sure, the feedback is much appreciated.

8:59:36 beach Whew! :)

9:00:50 heisig Now that our technique with the client arguments is documented, I can finally move on to the type inference stuff.

9:01:04 heisig But first I have to merge some SIMD patches that I received yesterday.

9:01:04 beach Good plan.

9:01:15 heisig It is a very interesting problem.

9:01:34 beach For Petalisp?

9:02:43 beach We also need to convince someone to modify Incless, but I guess I need to do that simultaneously with SICL.

9:03:29 heisig The SIMD patches are not directly tied to Petalisp. But people finally want SIMD so badly that I am receiving patches for my half-finished sb-simd library.

9:03:45 beach I see.

9:04:22 heisig Right now, I am re-reading the subtypep paper by Léo Valais.

9:04:45 heisig And maybe I should read Baker's original paper a fourth time.

9:04:46 beach Where they use Baker's technique?

9:05:04 beach Maybe you are better off using Bike's SUBTYPEP.

9:05:26 heisig That, too.

9:06:25 beach I think Bike gave up on Baker's method in favor of canonicalization. Maybe the reason was that Baker's method would be unable to handle the CONS type.

9:07:54 beach Then, if we go back to my musings about the purpose of type inference from the other day, a type like CONS would probably be enough for the purpose of performance.

9:08:46 heisig My plan right now is to have separate generic functions for precise type inference, and for approximate type inference.

9:09:16 beach What is meant by "precise type inference"? It smells undecidable.

9:10:54 heisig There are some cases where subtypep is guaranteed to be precise. And many further cases where it should be. For that to work, I can't just turn (CONS A B) into CONS.

9:11:38 heisig But a compiler will most certainly want approximate type inference, where the result is a type that is guaranteed to include the actual type of the value in question.

9:13:31 beach How do you then handle intervals of (say) integers?

9:14:19 beach SUBTYPEP can handle those, but your type inferencer would have a hard time.

9:16:01 beach (loop with x = 0 until (> (ackermann ...) ...) do (incf x) (finally (return x)))

9:16:34 heisig Exactly! That is one of the things I worry about in the approximate type inference.

9:16:36 beach Not a great example, but I think you get the point.

9:17:28 beach You would have to solve the halting problem in order to get precise types, unless by precise type inference you mean something else.

9:18:43 heisig Now I get it. The problem was that I wrote 'precise type inference', when I actually meant 'precise reasoning about types'. I don't plan to tackle the halting problem.

9:19:03 beach You had me worried there for a while.

9:19:40 beach So then I need to understand (at some point) what "precise reasoning about types" implies.

9:20:23 heisig Your loop example is an interesting problem, because an approximate type inference will have to switch from interval types to non-interval types eventually, or it won't terminate.

9:21:02 heisig So maybe a counter is needed.
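
A minimal sketch of the counter idea, assuming a toy abstract domain in which an integer is described either by a list (LO HI) or by the symbol INTEGER. The function names and the WIDEN-AFTER parameter are hypothetical. The sketch models only the counter X from the loop example: joining X with its incremented self would produce ever larger intervals, so after a fixed number of rounds the bounds are dropped.

```lisp
;; An abstract integer is a list (LO HI) of two integers, or the symbol
;; INTEGER meaning "some integer, bounds unknown".  Assumed representation.

(defun interval-join (a b)
  "Smallest abstract value in this domain covering both A and B."
  (if (or (eq a 'integer) (eq b 'integer))
      'integer
      (list (min (first a) (first b)) (max (second a) (second b)))))

(defun interval-1+ (a)
  "Abstract counterpart of (1+ x)."
  (if (eq a 'integer)
      'integer
      (list (1+ (first a)) (1+ (second a)))))

(defun infer-counter-type (&key (widen-after 3))
  "Approximate the type of X in (loop with x = 0 ... do (incf x))."
  (loop with x = '(0 0)
        for iteration from 1
        for next = (interval-join x (interval-1+ x))
        do (cond ((equal next x) (return x))   ; fixpoint reached
                 ((>= iteration widen-after)
                  (setf x 'integer))            ; widen: give up on bounds
                 (t (setf x next)))))

(infer-counter-type)  ; => INTEGER
```

Without the counter the sequence (0 0), (0 1), (0 2), ... never reaches a fixpoint. A finer widening could keep the lower bound and only drop the upper one.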

9:23:10 heisig Precise: When subtypep returns a second value of T, the first argument is correct.

9:23:36 heisig s/argument/value
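
As a small illustration of the two return values of SUBTYPEP: the second value says whether the first can be trusted. The helper name SUBTYPEP-VERDICT is made up for this sketch.

```lisp
(defun subtypep-verdict (type-1 type-2)
  "Return :YES, :NO, or :UNKNOWN for the question (subtypep TYPE-1 TYPE-2)."
  (multiple-value-bind (subtype-p certain-p) (subtypep type-1 type-2)
    (cond ((not certain-p) :unknown)  ; first value carries no information
          (subtype-p :yes)
          (t :no))))

(subtypep-verdict 'cons 'list)  ; => :YES
(subtypep-verdict 'list 'cons)  ; => :NO
;; With SATISFIES involved, an implementation is allowed to answer :UNKNOWN.
(subtypep-verdict '(satisfies evenp) 'integer)
```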

9:25:02 beach I can't figure out how you could do type inference without a finite approximation of the type system, and with such an approximation, you obviously can't represent everything.

9:42:03 beach Or to take an example using CONS: (loop with list = '() for i from 100 to 110 do (loop for j from 100 to 110 do (push (if (zerop (mod (ackermann i j) 7)) "yes" 234) list)) (finally (return list)))

9:42:11 beach What is the type of the return value?

9:50:44 beach Anyway, time for a lunch break.

9:51:25 beach heisig: I suggest you tackle "approximate type inference" first.

10:47:24 heisig beach: That one is easy, I already do that for Petalisp.

11:47:31 beach You already do the approximate one?

12:06:36 heisig Yes, I do. (https://github.com/marcoheisig/Petalisp/tree/master/code/type-inference)

12:07:12 heisig It is very fast and doesn't even cons. Obviously, SICL will have slightly different requirements than Petalisp.

12:08:09 beach I suppose. I don't quite see how you can take advantage of it, since you are not in control of the underlying Common Lisp implementation.

12:08:26 heisig Advantage of what?

12:08:35 beach The result of type inference.

12:08:50 beach I suppose you could take advantage of how (say) SBCL optimizes certain declarations etc.

12:10:13 heisig Petalisp uses the result to store results in specialized arrays, when possible. And it uses the type inference to eliminate certain type checks.

12:10:46 beach OK.

12:11:26 heisig But as I said, SICL has different requirement. I assume some amount of consing won't hurt, for example.

12:11:36 heisig *requirements

12:11:55 beach Right, consing won't hurt.

12:12:26 heisig The crucial question is how we intend to infer the value types of calling a particular function.

12:12:43 heisig SBCL uses hand-written type inference functions for that, but that is error prone and a lot of work.

12:13:26 beach Yes, I see. Are you talking about standard functions here?

12:13:56 heisig Right. Standard functions and, possibly, functions that have been declared constant by other means.

12:14:12 beach I see, yes.

12:14:38 heisig I wonder whether Cleavir could generate type inference functions automatically from the AST or HIR of the original function.

12:16:04 beach I think so, yes. I mean, the plan is to do type inference on HIR. That's partly why HIR exists after all. So if we are lucky, it will infer the types of the places that are ultimately returned.

12:16:16 no-defun-allowed So, performing type inference on the function itself?

12:17:13 beach I don't think it is good enough to do type inference on source code if that is what you are asking.

12:17:39 heisig One cool thing about such generated type inference functions would be that they could handle higher-order functions, too.

12:17:44 beach Transformations that are not possible to express in source code will improve the precision of the inferred types.

12:17:48 no-defun-allowed No, I understand we'd rather do HIR.

12:18:14 heisig I wonder if functions like MAP or REDUCE could be properly reasoned about automatically.

12:18:30 no-defun-allowed But then one writes inference rules on HIR instructions. I suppose that could be easier as there are fewer HIR instructions?

12:18:38 beach heisig: I don't know.

12:18:58 beach no-defun-allowed: As opposed to what?

12:20:20 no-defun-allowed As opposed to writing rules for functions instead, as SBCL does.

12:20:33 beach Oh, I see.

12:21:09 beach Perhaps they can obtain better precision with hand-written function types.

12:22:32 beach But, let me remind y'all that there are these two separate purposes of type inference. One is improved performance. The other is better compile-time messages to the programmer.

12:22:56 beach I find it clears things up to start thinking about performance improvements.

12:23:51 beach Because then a number of possible inferred types won't necessarily make any difference, so could be put off until later, or abandoned altogether.

12:24:00 heisig There is also the challenge of not having bugs in the type inference.

12:24:10 beach That too.

12:27:10 beach So to start with for SICL, there are going to be tons of tests for standard-object/not-standard-object simply because this test is required in order to determine whether to look for a stamp or not.

12:27:42 beach So lots and lots of discriminating functions will contain some variation of this test.

12:28:23 beach Like test for fixnum, character, cons, single-float.

12:28:35 beach Next, CONS is an important one.

12:28:48 beach So is NULL.

12:29:21 beach Of the remaining ones, I think all ARRAY types will be important too.

12:30:20 heisig But you also want to get (coerce X 'single-float) right, so suddenly you have to worry about symbols and EQL specifiers.

12:31:14 beach You mean for that particular symbol, i.e., SINGLE-FLOAT?

12:31:20 heisig Right.

12:31:53 beach I don't see how tracking that in the type inferencer would help.

12:31:59 beach Can you give me a hint?

12:33:09 beach Oh, I think I see.

12:33:11 heisig If the type inference understands EQL types, it can detect that the second argument to COERCE is the symbol SINGLE-FLOAT and hence the result type has to be single-float.

12:33:19 beach Right.

12:33:55 beach So that would be an example of what the type inferencer would do on the function COERCE.

12:35:48 heisig It would check whether the second argument is known to be a constant. If so, it could turn that constant into a type (or emit a warning if it is not a type specifier).

12:36:01 heisig Then it can return that type as the return type of COERCE.
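
A rough sketch of the rule heisig describes, under assumptions that are not in the log: each argument is described either as (:CONSTANT value) or as (:TYPE type-specifier), and the function names are invented. Checking that the constant really is a type specifier (and warning otherwise) is left out, since there is no portable predicate for that.

```lisp
(defun infer-coerce-result-type (object-arg type-arg)
  "Approximate the type of (coerce OBJECT TYPE) from argument descriptions.
Each description is (:CONSTANT value) or (:TYPE type-specifier)."
  (declare (ignore object-arg))
  (cond ((not (eq (first type-arg) :constant))
         t)                               ; result type not known at compile time
        ((eq (second type-arg) 'complex)
         'number)                         ; (coerce 1 'complex) returns 1
        (t (second type-arg))))

(defun coerce-call-redundant-p (object-arg type-arg)
  "True if OBJECT is already known to be of the requested type."
  (and (eq (first type-arg) :constant)
       (eq (first object-arg) :type)
       ;; Only the first value is used: an uncertain answer counts as NIL.
       (values (subtypep (second object-arg) (second type-arg)))))

(infer-coerce-result-type '(:type t) '(:constant single-float))
;; => SINGLE-FLOAT
(coerce-call-redundant-p '(:type single-float) '(:constant single-float))
;; => T
```

The second function relies on COERCE returning the object itself when it already has the requested type, which is what makes eliminating the call safe.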

12:37:02 heisig Ideally, that pass could also switch to a specialized version of COERCE. And in the best case where the first argument to COERCE already has the correct type, the call can be optimized away entirely.

12:37:08 beach Right. So the other thing that we absolutely must take into account is the frequency of occurrence of certain constructs vs the cost of not handling those constructs.

12:38:04 beach I don't think I have a single occurrence of COERCE in any code I have written. But it might of course be used in some arithmetic function for numeric contagion.

12:40:49 heisig A trick I use occasionally is (coerce sequence 'simple-vector). Because every sane CL compiler will then optimize the element access.

12:42:20 beach Possibly. Except that in SICL, all vectors are simple.

12:42:55 beach And access to the vector elements would very likely be done in a loop, so lots of checks would be handled by loop invariants.

12:43:37 heisig You mean all arrays are simple. Or do we not plan to have specialized vectors?

12:43:53 beach All arrays are simple.

12:44:29 beach Except possibly the displaced ones.

12:45:18 heisig The 'simple' in 'simple-vector' has a different meaning than the 'simple' in 'simple-array'. It implies the element type is T.
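
The distinction can be checked at the REPL. The helper SIMPLICITY is just a name chosen for this illustration.

```lisp
(defun simplicity (array)
  "Return a list (SIMPLE-ARRAY-P SIMPLE-VECTOR-P) for ARRAY."
  (list (and (typep array 'simple-array) t)
        (and (typep array 'simple-vector) t)))

(simplicity (make-array 3))       ; => (T T)
(simplicity (make-array '(2 2)))  ; => (T NIL), not one-dimensional
(simplicity "abc")                ; => (T NIL), element type is not T

;; SIMPLE-VECTOR is the same type as (SIMPLE-ARRAY T (*)).
(subtypep 'simple-vector '(simple-array t (*)))  ; => T, T
```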

12:45:25 beach Oh, sorry.

12:46:01 beach I am saying that, if the elements of SEQUENCE is then accessed in a loop, there will be tests for that in AREF, and those tests will then become loop invariants, so the test will be done once anyway.

12:46:30 heisig ACTION is not entirely convinced.

12:46:33 beach s/is/are/

12:47:22 beach It is possible that I am wrong in some such cases but that's not the point of what I am trying to say right now.

12:48:04 beach The point is that we should not try to optimize cases that don't matter. Either because they don't happen frequently enough, or because optimizing them won't make a difference.

12:48:39 beach And I think it is crucial to think that through before putting work into creating code that will then have to be maintained, even though it serves no purpose.

12:49:18 beach And here "think that through" might involve some profiling to be convinced in some cases.

12:50:07 heisig I get what you are saying. But I am still trying to come up with a solution that is easy to maintain, is correct, and produces high-quality results.

12:50:19 heisig If that fails, I can go for more humble goals.

12:50:43 beach More power to you. And good luck.

13:09:07 beach heisig: So what is your overall plan for the type inferencer?

13:10:00 beach I am asking because, as we have seen in the past, we may need some simple value numbering to infer the types of some variables.

13:14:34 heisig Heh, I was just thinking about value numbering :)

13:15:18 heisig The current plan is to write a HIR transformation for turning functions that operate on values into functions that operate on types.

13:15:33 heisig If that works, we'd get all sorts of inference for free.

13:16:22 heisig We also have to choose a representation for types. This is what's giving me a headache right now.

13:16:28 beach OK.

13:16:39 beach I am not sure I understand that idea, but my understanding can wait until you have something more concrete.

13:17:17 beach The other thing is control flow. We currently don't have a good idea of control flow in more complicated situations. And the type a variable can take on crucially depends on it.

13:17:28 MichaelRaskin heisig: and you will need to run the type operations in both directions?

13:17:53 beach MichaelRaskin: What is the second direction here?

13:19:05 MichaelRaskin If you have an operation taking only type X and producing type Y, you use it both to infer the type of values dependent on the output and to infer what type the original parameters need to be.

13:19:57 beach And what do you do with the last type of information that doesn't violate the semantics of the language?

13:20:34 beach That's what Baker's type inferencer does, and it does not produce conforming code.

13:21:41 beach heisig: The problem with control flow is shared variables. If you have (f (lambda (y) (setf x y))), then x can be set to anything at any time, because F might create a thread.

13:21:46 MichaelRaskin Ah, datum is required in the type-error

13:22:42 heisig beach: That is an easy case. The approximate type of X is simply T.

13:22:57 heisig The harder cases are those where we could infer something more precise than T.

13:23:01 beach heisig: That would eliminate lots of cases.

13:23:43 beach heisig: Because it might be known that F does NOT create a thread. So then the damage is limited.

13:24:03 beach Say, F = SORT.

13:24:43 beach But the problem is that we need to have as precise a control graph as possible, and I believe Bike gave up on that.

13:25:27 heisig Automatic HIR rewriting should be able to handle cases like SORT.

13:25:50 beach OK.

13:26:06 beach Just pointing out potential issues here.

13:27:18 heisig Time for a coffee break.

15:24:37 beach OK, I think I understand how this-parsing-technology-that-can't-be-called-packrat-parsing-because-it-handles-left-recursion works. Next step is to read scymtym's code and documentation for the library named parser.packrat, whatever technique it may use.

15:52:30 beach Hmm, it looks like the parser.packrat library depends on sb-cltl2.