Pyrel dev log, part 6

**AnonymousHero** · June 12, 2013, 20:45

Oh, I see... what you really want is a representation of procs (including all their dependencies and state)... which you basically can't get because of the hiding implied by lambdas.

I'm afraid I cannot offer any sensible advice, then. AFAICT the only solution is to *embed* the proc language such that it's actually a language that's *interpreted* by the Pyrel game code (as an AST). Of course it could also be compiled to python bytecode or whatever when actually executed, but that's just an optimization...

**Derakon** · June 12, 2013, 21:04

If at all possible, I want to avoid any approach that involves serializing bytecode (or code represented in any other manner, for that matter). As soon as you do that, you can't trust any distributed savefile to not be malicious. For similar reasons I can't use the pickle library in Python to do [de]serialization for me -- pickle allows you to create custom objects with their own [de]serialization routines which can do anything.
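To make the pickle concern concrete, here is a minimal sketch. The class name `Sneaky` is made up for illustration, and the payload is deliberately harmless; the point is only that the pickled data, not the loading program, decides which callable runs.

```python
import pickle


class Sneaky(object):
    """Hypothetical object whose pickled form is really a function call."""

    def __reduce__(self):
        # pickle records "call this callable with these arguments" and
        # performs the call at load time. Here it is a harmless eval(),
        # but it could be any importable callable.
        return (eval, ("6 * 7",))


payload = pickle.dumps(Sneaky())

# Loading runs eval("6 * 7") and returns its result instead of a Sneaky.
result = pickle.loads(payload)
```

A savefile built this way would run its chosen call as soon as the game tried to load it, which is why a data-only format such as JSON is the safer choice for files that get passed around.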

At this point, I'm hoping there's something clever that I've missed, but I'm not especially optimistic.

**AnonymousHero** · June 12, 2013, 21:15

Indeed -- off the top of my head, I think the only possibility is embedding and reification (...but I may very well have missed something!).

**Magnate** · June 13, 2013, 18:41

I wonder if a non-coder's perspective might be helpful: could we change the way procs affect state? If, for example, the temporaryStatModProc created a timer that was a member of the Thing itself (thing.temporaryStatModProc1Timer), then it would be trivially serialised along with the Thing. Then all we need is to decode the timer expiry into the relevant proc trigger. This basically moves the problem from serialisation to mapping of procs to timers.
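A rough sketch of what that could look like, assuming timers are stored as plain data on the Thing and a registry maps proc names to expiry code. Every name here (`Thing`, `EXPIRY_HANDLERS`, `restoreStat`, `addTemporaryStatMod`) is hypothetical and not taken from the Pyrel source.

```python
import json


def restoreStat(thing, params):
    """Undo a temporary stat modification when its timer runs out."""
    thing.stats[params["stat"]] -= params["amount"]


# Hypothetical registry: the proc name stored in the save selects the
# expiry code, so no code ever has to be written into the savefile.
EXPIRY_HANDLERS = {"temporaryStatMod": restoreStat}


class Thing(object):
    def __init__(self):
        self.stats = {"STR": 10}
        # Each timer is plain data: proc name, turns left, parameters.
        self.timers = []

    def addTemporaryStatMod(self, stat, amount, duration):
        self.stats[stat] += amount
        self.timers.append({"proc": "temporaryStatMod",
                            "turnsLeft": duration,
                            "params": {"stat": stat, "amount": amount}})

    def tick(self):
        """Advance one turn, firing the handler of any expired timer."""
        remaining = []
        for timer in self.timers:
            timer["turnsLeft"] -= 1
            if timer["turnsLeft"] <= 0:
                EXPIRY_HANDLERS[timer["proc"]](self, timer["params"])
            else:
                remaining.append(timer)
        self.timers = remaining


thing = Thing()
thing.addTemporaryStatMod("STR", 3, 2)

# The whole state, timers included, survives a JSON round trip.
saved = json.dumps({"stats": thing.stats, "timers": thing.timers})
data = json.loads(saved)
restored = Thing()
restored.stats = data["stats"]
restored.timers = data["timers"]
restored.tick()
restored.tick()
```

After the two ticks the restored Thing's STR is back to its base value, even though the effect was applied before the save and expired after the load.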

**Derakon** · June 13, 2013, 19:17

You can do that, but it amounts to codifying what effects Procs are allowed to have. Currently they can do anything as long as you're willing to write the code for them (including the code to handle [de]serialization of course).

Codifying allowable Proc effects is roughly equivalent to codifying how Procs are allowed to function internally (cf. disallowing lambda functions) -- both constrain your options to make the serialization process feasible. However, the latter is more flexible and thus IMO more desirable.

**AnonymousHero** · June 14, 2013, 19:45

Actually, maybe you're looking for something like Applicative Combinators for procs?

**Derakon** · June 14, 2013, 20:24

A quick google doesn't turn up a conclusive definition for that term. Do you have a reference handy that describes it?

**AnonymousHero** · June 15, 2013, 06:12

Sorry, I should learn to be less opaque.

It's basically a combination of abstractions which lets you build up big computations from many smaller computations in a structured way. These computations can then be combined further, etc. The idea is that the built-up computation is "introspectable" (and thus can easily be made serializable).

The idea comes from functional programming, so I'm not quite sure how well it would translate to Python, but here are a couple of pages which go into a bit more detail with further pointers:

Combinator pattern - HaskellWiki: http://www.haskell.org/haskellwiki/Combinator_pattern

http://learnyouahaskell.com/functors-applicative-functors-and-monoids#applicative-functors
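For a rough feel of how the idea might carry over to Python, here is a minimal sketch. It shows the general combinator pattern (small computations that are objects, combined into bigger ones that can describe themselves) rather than applicative functors specifically, and the names `ModStat`, `Sequence`, `describe` and `build` are invented for this example.

```python
import json


class ModStat(object):
    """Primitive computation: add `amount` to one stat."""

    def __init__(self, stat, amount):
        self.stat = stat
        self.amount = amount

    def run(self, stats):
        stats[self.stat] = stats.get(self.stat, 0) + self.amount

    def describe(self):
        # The computation can report its own structure as plain data.
        return ["ModStat", self.stat, self.amount]


class Sequence(object):
    """Combinator: build a bigger computation from smaller ones."""

    def __init__(self, *steps):
        self.steps = list(steps)

    def run(self, stats):
        for step in self.steps:
            step.run(stats)

    def describe(self):
        return ["Sequence"] + [step.describe() for step in self.steps]


def build(description):
    """Rebuild a computation from the data produced by describe()."""
    kind, args = description[0], description[1:]
    if kind == "ModStat":
        return ModStat(*args)
    if kind == "Sequence":
        return Sequence(*[build(arg) for arg in args])
    raise ValueError("Unknown computation type: %s" % kind)


potion = Sequence(ModStat("STR", 3), ModStat("DEX", -1))
saved = json.dumps(potion.describe())

rebuilt = build(json.loads(saved))
stats = {"STR": 10, "DEX": 10}
rebuilt.run(stats)
```

Because `build` only instantiates the computation types it knows about, the saved data can select and parameterize behavior but cannot smuggle in new code, which is the property a lambda cannot offer.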

**Derakon** · July 9, 2013, 23:52

Just a quick update: I've finally gotten the save() function to complete in Debug Town without erroring out. I haven't tried load() yet outside of my unit test (which was working much earlier than the Debug Town test).

The savefile is 6.8MB. It takes about 2.5s to generate.

I...might have some optimization work to do.

I'd estimate nearly 2MB of that is given over to massively redundant terrain entries for all of the walls in Debug Town. Between the Terrain instances themselves, their stats, their Procs for tunneling, and their display data, each one takes over a kilobyte, and there are about 1600 of them. A decent aliasing system to handle storage of identical objects would be able to trim the entire terrain set down to probably under 10kB.
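One way such aliasing could work, as a minimal sketch: store each distinct record once and replace every occurrence with an index into that list. The `dedupe` helper and the sample terrain records are made up for illustration; in the real engine each object's unique ID would have to be kept outside the shared part.

```python
import json


def dedupe(records):
    """Return (unique, aliases): each distinct record stored once, plus
    one index into `unique` for every original record, in order."""
    unique = []
    indexByKey = {}
    aliases = []
    for record in records:
        # A canonical JSON dump serves as the identity of the record.
        key = json.dumps(record, sort_keys=True)
        if key not in indexByKey:
            indexByKey[key] = len(unique)
            unique.append(record)
        aliases.append(indexByKey[key])
    return unique, aliases


# Hypothetical terrain data: 1600 identical walls and one floor tile.
wall = {"name": "granite wall", "stats": {"hardness": 5},
        "display": {"symbol": "#", "color": "gray"}}
floor = {"name": "floor", "stats": {}, "display": {"symbol": "."}}
terrain = [dict(wall) for _ in range(1600)] + [dict(floor)]

unique, aliases = dedupe(terrain)
plainSize = len(json.dumps(terrain))
aliasedSize = len(json.dumps({"unique": unique, "aliases": aliases}))
```

Here the 1601 records collapse to two stored records plus a list of small integers, so `aliasedSize` comes out far smaller than `plainSize`.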

Some notes:
* The file compresses to about 600kB using gzip; this is totally reasonable as far as I'm concerned. Python has built-in support for reading and writing gzipped files, though it's of course slower than working with plaintext.
* It takes about 0.25s to add all of the game objects to the serializer, and about 1.2s to clean things up (replace object/function references, replace tuples with lists, etc.). The rest of the time is spent writing over 7 million characters; this could be done in a different thread. That still leaves us with saving a frankly rather simple level stopping play for about a second and a half, which isn't acceptable.
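For reference, the built-in gzip support mentioned in the notes above can be sketched like this. It assumes Python 3 (text-mode `gzip.open` with an encoding) and a JSON-compatible state; `saveCompressed`, `loadCompressed` and the sample state are hypothetical names, not the Pyrel API.

```python
import gzip
import json
import os
import tempfile


def saveCompressed(data, path):
    """Write JSON-compatible data to a gzip-compressed text file."""
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        json.dump(data, handle)


def loadCompressed(path):
    """Read data back from a file written by saveCompressed()."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


# Highly repetitive sample state, standing in for a map full of walls.
state = {"cells": [{"terrain": "granite wall", "items": []}] * 1000}

path = os.path.join(tempfile.mkdtemp(), "save.json.gz")
saveCompressed(state, path)
restored = loadCompressed(path)

plainSize = len(json.dumps(state))
compressedSize = os.path.getsize(path)
```

Repetitive data like this compresses very well, which matches the large ratio seen on the real savefile.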

Loading, as noted, is as-yet unknown. I still need to hook up the commands before I can start seriously testing it.

**Patashu** · July 10, 2013, 01:27

Can you talk a bit about how you solved the problems you were worried about earlier (related to serializing what is essentially code safely)?

**Derakon** · July 10, 2013, 02:07

I solved that problem by refusing to do it. Lambda functions cannot be serialized. You can serialize references to a function, but only if that function is a method of a class instance that is also being serialized. So for example, player.canSee() is valid, because it's a method of an instance of the Player class, but trying to serialize procs.procLoader.getProc() would be invalid since it's just a bare function.

Here's the bit of code that handles serialization of function references. Of course, a function reference is not a valid JSON type, so instead we generate a string that encodes the necessary information, so we can extract it later.

Code:

    ## Given an input object that is a function, generate a string of the
    # form "__pyrelFunctionPointer__:object ID:function name".
    def makeFunctionString(self, func, *parents):
        # A bound method's __self__ attribute (also spelled im_self in
        # Python 2) is the object the method is bound to.
        obj = func.__self__
        # Ensure the object that the function is bound to will be serialized.
        if obj not in self.objectToId:
            if obj.__class__.__name__ in NAME_TO_DESERIALIZATION_FUNCS:
                self.addObject(obj)
            else:
                raise RuntimeError("Tried to serialize a function reference for an object that is not itself being serialized: %s. Parentage: %s" % (str(obj), str(parents)))
        boundId = obj.id
        # The __name__ field is the function's name in string form.
        funcName = func.__name__
        return "__pyrelFunctionPointer__:%s:%s" % (boundId, funcName)

In the case of Player.canSee(), then, we would generate a string that looks something like "__pyrelFunctionPointer__:182957:canSee", where 182957 is the ID of the particular Player instance. Elsewhere in the code we have the serialization of that Player instance, including its ID. When we deserialize the Player later, we retrieve its ID, and when we then encounter this string, we can say "Ahh, that means calling the canSee() method on object 182957, which is this Player instance."

Make sense?
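The reverse direction is not shown in the post, but it could look roughly like the following sketch. `resolveFunctionString`, the `idToObject` lookup table and the toy `Player` class are assumptions made for illustration, not the actual util.serializer code.

```python
PREFIX = "__pyrelFunctionPointer__"


def resolveFunctionString(text, idToObject):
    """Turn "__pyrelFunctionPointer__:<id>:<name>" back into a bound
    method, given a mapping from object IDs to deserialized objects."""
    prefix, objectId, funcName = text.split(":")
    if prefix != PREFIX:
        raise ValueError("Not a function pointer string: %s" % text)
    obj = idToObject[int(objectId)]
    # getattr on the instance yields the method already bound to it.
    return getattr(obj, funcName)


class Player(object):
    """Toy stand-in for the real Player class."""

    def __init__(self, id):
        self.id = id

    def canSee(self):
        return True


player = Player(182957)
idToObject = {player.id: player}
func = resolveFunctionString("__pyrelFunctionPointer__:182957:canSee",
                             idToObject)
```

Since the string can only name a method on an object that was itself rebuilt from the savefile, the file never gets to supply code of its own.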

**Derakon** · July 11, 2013, 22:10

Ha! Save and load both work! And the system is honestly quite elegant, in the sense that the rest of the engine need care very little about exactly how serialization and deserialization are handled. Individual objects have the following requirements:

* Must have an 'id' field that is unique across all objects.
* Must have a getSerializationDict() function that generates a dict version of the object's state (oftentimes this is simply object.__dict__).
* Must provide a function that creates a new instance of the object with no data filled in (sometimes this can be simply the object's constructor).
* May provide a function to fill in data on a "blank" object created with the above (otherwise, a default function just setattrs everything into place).

If you do all that, then serialization and deserialization will Just Work for most cases. The serializer is handed the GameMap, and from that it is able to track down every object in the game and serialize them all. Likewise, the deserializer is able to construct a new GameMap and populate it and all of the other objects from the savefile, reconstructing object relationships as they were before saving.
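As a minimal sketch of an object that meets the requirements listed above: only `id` and `getSerializationDict()` come from the post, while `Item`, `makeBlankItem` and `fillObject` are hypothetical names standing in for whatever the real serializer uses.

```python
import json


class Item(object):
    def __init__(self, id, name, weight):
        self.id = id          # unique across all objects
        self.name = name
        self.weight = weight

    def getSerializationDict(self):
        # For a simple object, a copy of __dict__ is the whole state.
        return dict(self.__dict__)


def makeBlankItem():
    """Create an Item with no data filled in, bypassing __init__."""
    return Item.__new__(Item)


def fillObject(obj, data):
    """Default fill function: setattr every saved field into place."""
    for key, value in data.items():
        setattr(obj, key, value)


original = Item(42, "Potion of Strength", 0.4)
saved = json.dumps(original.getSerializationDict())

restored = makeBlankItem()
fillObject(restored, json.loads(saved))
```

This only covers a flat object; references to other objects and to functions would need the ID-based replacement described earlier in the thread.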

Of course, all this generality comes at the expense of performance, as previously described. My earlier numbers were overoptimistic as the entire game state was not being fully captured; currently we're looking at something on the order of 10-12 seconds to save the game, and a bit less to load it. An uncompressed save of the town is 14MB; compressed, it is 1MB. Interestingly, a save of a dungeon is only 16MB, so it seems that much of the size of the savefile is relatively invariant with the amount of stuff on the level. The implication here is that finding a way to more compactly represent the Cells (which hold all the stuff actually in the map) could result in big savings.

Things aren't entirely bug-free yet (sometimes the GameMap somehow gets Containers instead of Cells (a sub-type of Container) holding its map contents, for example), but this is major progress...

**Magnate** · July 13, 2013, 12:01

Congrats! Keep up the good work.

**Derakon** · July 13, 2013, 15:49

I've fixed the bug I mentioned earlier; it had to do with having multiple objects with the same ID, which caused confusion when deserializing the game later. This does mean that the previous savefiles were also incomplete (since they only included one of the objects); new saves are 25MB in town (compressing to 1.5MB), take 10.5s to save, and 8.5s to load, on my fairly powerful desktop computer.

Optimizing this down to where saving takes negligible time may not be possible; that's at least two orders of magnitude that need to be optimized away, and I don't think there are really any algorithmic improvements that can be made. That said, if you want to check the code out yourself, it's in the "saveload" branch on my repo. Hit 'S' to save a game (to save.txt), and '!' to load it. Check commands.user.SaveCommand.execute() for the starting point of the save system, and LoadCommand for loading. The util.serializer code has all of the heavy lifting.

**Patashu** · July 14, 2013, 02:26

If it compresses to 1.5MB from 25MB, that implies an order of magnitude improvement can be made, does it not?
