freenode/#shirakumo - IRC Chatlog
20:16:06
SAL9000
right... but if the directory-depot is inside a zip file, you've lost zero-copy with respect to the backing store already...
20:16:32
Shinmera
you wouldn't have a directory-depot in a zip file, it'd have to be a zip-directory-depot.
20:18:07
Shinmera
I think what I'm seeing is that stuff that's not compressed can nest arbitrarily quite naturally, and stuff that is will "automatically" flatten the hierarchy back to an in-memory sequence anyway and continue from there.
20:19:20
SAL9000
I think there are some compression schemes where you can start at a keyframe-like point and go from there
20:20:01
SAL9000
also some archive types (zip?) don't force you to decompress all files, just the files you want to read... right?
20:20:13
Shinmera
also I think it would be worthwhile to do the change-class/conversion as a separate method that's automatically invoked as soon as you try to open the entry but no earlier, to avoid stuff like listing entries opening tons of zip files.
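[Editor's sketch] The "convert on open, not on listing" idea can be illustrated roughly as follows. This is a minimal Python sketch, not the Lisp protocol being drafted in the channel: the `Entry` class, its `open` method, the `conversions` counter, and the `make_zip` helper are all hypothetical names introduced for illustration, and the standard-library `zipfile` module stands in for the zip backend.

```python
import io
import zipfile


class Entry:
    """Hypothetical entry: holds raw bytes and converts to a depot only on demand."""

    def __init__(self, name, data):
        self.name = name
        self._data = data
        self._depot = None    # filled in lazily by open()
        self.conversions = 0  # counts how often the conversion step actually ran

    def open(self):
        # The conversion runs on the first open and no earlier,
        # so merely listing entries never parses a nested archive.
        if self._depot is None:
            self.conversions += 1
            self._depot = zipfile.ZipFile(io.BytesIO(self._data))
        return self._depot


def make_zip(files):
    """Build an in-memory zip archive from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, payload in files.items():
            zf.writestr(name, payload)
    return buf.getvalue()
```

Listing `[e.name for e in entries]` touches only the names; the nested archive is parsed once, when `open()` is first called on an entry.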
20:20:55
Shinmera
When I say "decompress" here I don't mean when you open the zip depot, but when you access another zip in a zip.
20:21:34
SAL9000
right, which would (unless you go crazy with the layer-breaking) force you to decompress the whole 2nd-level zip file from the 1st-level one
20:23:14
SAL9000
yes, but it's still a zip-in-zip and "default" behaviour still "decompresses" it into a sequence
20:23:46
Shinmera
well the zip backend can make use of the compression attribute to figure out what to do.
20:24:15
Shinmera
or actually, requesting an entry from the depot will automatically figure out what to do
20:25:25
SAL9000
child tries to do zero-copy read, if parent is "store" it works, if parent is compressed it "fails" by making a new sequence...
20:26:43
SAL9000
parent constructs new sequence, stores in a cache slot, which is then used to serve future zero-copy calls.
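[Editor's sketch] The cache-slot behaviour described here might look roughly like this. It is a Python sketch under stated assumptions: `ParentEntry`, `view`, and the `decompressions` counter are hypothetical names, `zlib` stands in for whatever compression the archive uses, and `memoryview` stands in for a zero-copy sub-sequence.

```python
import zlib


class ParentEntry:
    """Hypothetical parent entry that serves byte ranges to a child."""

    def __init__(self, raw, compressed):
        self._raw = raw                # bytes as they sit in the backing store
        self._compressed = compressed  # False corresponds to the "store" method
        self._cache = None             # cache slot for the decompressed sequence
        self.decompressions = 0

    def view(self, start, end):
        if not self._compressed:
            # Stored entry: hand out a view of the backing bytes, no copy.
            return memoryview(self._raw)[start:end]
        if self._cache is None:
            # Compressed entry: the zero-copy read "fails" once by building
            # a new sequence, which is then kept in the cache slot.
            self.decompressions += 1
            self._cache = zlib.decompress(self._raw)
        # Later calls are served as views of the cached sequence.
        return memoryview(self._cache)[start:end]
```

The whole payload is decompressed on the first `view` call even if the child only wants a few bytes, which is the extra work the surrounding messages refer to.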
20:26:58
Shinmera
zip1 itself can't be compressed since it has to be decompressed to see what it contains. zip2 in zip1 can be compressed and will only decompress if needed.
20:28:13
SAL9000
"fails" in that it ends up having to do more work -- decompress the whole zip -- than the child may immediately need
20:29:29
SAL9000
if zip2 (child) is not compressed, zip1 (parent) can read directly from the backing store just those bytes that the child wants
20:30:45
Shinmera
zip1 would not do anything. zip2 is an entry that contains a stream and an offset in that case.
20:31:43
SAL9000
in your draft protocol, WRITE-TO/READ-FROM start/end parameters are referring to the sequence, not the entry, right?
20:34:53
Shinmera
this entry/depot knows it's coming from a directory-depot and as such internally has a file-stream to do its stuff with
20:35:54
Shinmera
when you request a list of entries, it parses the zip file structure and generates zip-entry-depots, each of which knows it's coming from a zip file, so they have the corresponding zip entry structure in them.
20:35:54
SAL9000
so... you end up with a zip-file-depot-from-directory-depot, zip-file-depot-from-zip-file-depot, etc.?
20:37:23
Shinmera
so that entry is a zip-file-depot with a file-stream that starts at a certain position
20:37:55
SAL9000
Maybe I'm looking at it wrong, but I'm seeing that file-stream as a (severe) layering violation
20:38:43
SAL9000
Why does the child zip-file-depot need to know that it comes from a real file-system?
20:41:30
Shinmera
in your story the depot now also needs a table to store where entries start/end or whatever, rather than that info (and whatever else) just more naturally being part of the entry.
20:42:30
Shinmera
well now you're forcing every entry to have a start/end index even when there's no need to have one
20:43:31
Shinmera
my point is, why not just store all the info that's related to how to get the entry data in the entry itself
20:45:40
Shinmera
yes because in the above scenario we have depots that are connected to file-streams.
20:45:56
SAL9000
so you either need to duplicate the code -- file-stream version vs sequence version
20:46:32
Shinmera
you don't want to do the latter because it's slow and would just lead to more copying.
20:46:55
Shinmera
having two variants is not much more effort and is far more efficient because now you can create sub-entries that just address sub-sequences.
20:47:31
SAL9000
...would it make sense to provide something that quacks like a sequence but is backed by a file-stream?
20:48:28
Shinmera
in zippy I have an "io" structure that has its own API and has two implementations for vectors or for streams.
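[Editor's sketch] The general shape of "one API, two backings" can be illustrated as below. This is not zippy's actual `io` code, which is written in Common Lisp and not reproduced in this log; `VectorIO`, `StreamIO`, and `read_u16_le` are hypothetical names chosen for the example.

```python
import io


class VectorIO:
    """Reads from an in-memory byte sequence, starting at a given offset."""

    def __init__(self, data, start=0):
        self._data = data
        self._pos = start

    def read(self, n):
        chunk = bytes(self._data[self._pos:self._pos + n])
        self._pos += len(chunk)
        return chunk


class StreamIO:
    """Reads from a seekable binary stream, starting at a given offset."""

    def __init__(self, stream, start=0):
        self._stream = stream
        self._stream.seek(start)

    def read(self, n):
        return self._stream.read(n)


def read_u16_le(source):
    """Parse a little-endian 16-bit integer from either kind of source."""
    return int.from_bytes(source.read(2), "little")
```

Parsing code such as `read_u16_le` is written once against the shared `read` interface, and the `start` argument plays the role of the entry-level offset discussed earlier.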
20:49:08
Colleen
github.com/Shinmera/zippy/b... Website (HTML), Title: zippy/io.lisp at master · Shinmera/zippy · GitHub
20:51:31
Shinmera
it's still not "very fast" because each individual call of these functions will dispatch, rather than dispatching once and then the rest knows.
20:52:07
Shinmera
the defmethod is because size is used elsewhere and I was too lazy to separate them.
20:53:16
SAL9000
...right, you still don't win unless you have two (identical) copies of the body of the caller, under an etypecase
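[Editor's sketch] The difference between dispatching on every call and dispatching once around a duplicated body can be shown structurally. This Python sketch uses hypothetical function names and only shows the code shape; Python gains none of the compile-time specialisation being discussed for Lisp.

```python
def read_byte(source, pos):
    """Per-call dispatch: the type test runs again for every single byte."""
    if isinstance(source, (bytes, bytearray)):
        return source[pos]
    source.seek(pos)
    return source.read(1)[0]


def total_per_call(source, count):
    """Sum the first `count` bytes, dispatching inside the loop."""
    return sum(read_byte(source, i) for i in range(count))


def total_dispatch_once(source, count):
    """Sum the first `count` bytes, dispatching once up front."""
    if isinstance(source, (bytes, bytearray)):
        # Copy of the loop body specialised for in-memory sequences.
        total = 0
        for i in range(count):
            total += source[i]
        return total
    # Copy of the loop body specialised for seekable streams.
    source.seek(0)
    total = 0
    for b in source.read(count):
        total += b
    return total
```

Both functions return the same result; the second pays for the type test once, at the cost of carrying two near-identical loop bodies.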
20:54:19
Shinmera
having some magic that eliminates further dispatchers once it knows the type would be nice.
20:56:27
Shinmera
but usually emitting typecase is "good enough" and while slow at compiling, SBCL should eliminate the branching if it knows the variable type.