Pyrel dev log, part 6

  • Derakon
    Prophet
    • Dec 2009
    • 9022

    #31
    In filesize, not in the speed with which files can be generated. It's the latter that's the problem. I don't really care about the size of the savefile as long as it's not completely absurd.


    • AnonymousHero
      Veteran
      • Jun 2007
      • 1393

      #32
      Have you tried profiling? (I'm guessing yes, but it never hurts to ask.)

      There might be some easy wins in there.


      • Derakon
        Prophet
        • Dec 2009
        • 9022

        #33
        Yes, I did, of course. There's nothing obviously improvable. The save code spends a lot of time on:

        * Analyzing the types of data so it can determine if they need to be converted into something JSON-friendly (e.g. object reference -> string).
        * Actually encoding a bunch of JSON and writing it out (all time spent in the json module).

        The latter step is not a priority, since the process of writing the savefile can be done in the background. It's the process of getting the data ready for writing that needs to be sped up.

        Every single datum fed to the serializer has to be analyzed to ensure type safety, so that the json module doesn't throw an exception when it hits an invalid type. I did a basic optimization there to run the typechecks in order of descending frequency (i.e. minimizing the number of isinstance calls), but I don't know that there's much more to squeeze out of that function.
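
        To make that ordering concrete, here's a minimal sketch of the idea -- not Pyrel's actual code; the helper at the end just stands in for the real serializer.py function of the same name:
        Code:
          def makeObjectReferenceString(value):
              # Stand-in for the real serializer.py helper named in the profile.
              return "REF:%d" % id(value)

          def cleanValue(value):
              # Primitives dominate in practice; accept them with one check.
              if value is None or isinstance(value, (int, float, bool, str)):
                  return value
              if isinstance(value, list):
                  return [cleanValue(item) for item in value]
              if isinstance(value, dict):
                  return dict((cleanValue(k), cleanValue(v))
                              for k, v in value.items())
              # Rarest case last: anything else becomes a reference string.
              return makeObjectReferenceString(value)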

        That leaves us with less obvious optimizations and a lot of guessing. For example, I could try modifying the serializer to blindly accept certain serializations as "safe" so they wouldn't need to be double-checked. Any "safe" data would need extra handling on the part of the objects being serialized, though, so it might not be worth the effort. It would also probably screw with object references from elsewhere. For example, if I made the Stats/StatMod system "independently serializable", so that a Stats instance would hand the serializer a dict that it could use to recreate itself, then individual StatMod instances would no longer be visible to the rest of the game; any object tracking those would be out of luck.
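
        For illustration, a hypothetical version of that Stats arrangement might look like this (all the attribute names here are invented for the sketch, not Pyrel's real API):
        Code:
          class StatMod(object):
              def __init__(self, name, amount):
                  self.name = name
                  self.amount = amount

          class Stats(object):
              def __init__(self, mods=None):
                  self.mods = mods or []

              def getSaveBlob(self):
                  # The serializer would accept this dict blindly, skipping
                  # all typechecks -- that's the entire point.
                  return {"mods": [[m.name, m.amount] for m in self.mods]}

              @classmethod
              def fromSaveBlob(cls, blob):
                  return cls([StatMod(n, a) for n, a in blob["mods"]])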

        I could teach the serializer to replace Cells (i.e. Containers that hold things in a specific map tile) with coordinate pairs, and to reconstruct them at the other end. There's a lot of redundancy in the savefile when dealing with map tiles; all that could potentially be removed.
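
        Something like this, say (flattenCell/restoreCell and gameMap.getCell() are assumed names for the sketch):
        Code:
          def flattenCell(cell):
              # Replace the Cell container with just its coordinates.
              return {"__cell__": [cell.x, cell.y]}

          def restoreCell(entry, gameMap):
              # On load, look the Cell back up from the reconstructed map.
              x, y = entry["__cell__"]
              return gameMap.getCell(x, y)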

        Of course, I could also rewrite the serializer in C -- basically making the appropriate Python API calls for typechecking. That would be a colossal pain, though. Cython is supposed to do this kind of thing automatically, but being automated it doesn't always catch everything. Cythonizing the module saves about 1.5s on getting all the data ready (down to about 4s), and no time whatsoever on actually writing it (presumably the json module is already written in C).
        Last edited by Derakon; July 15, 2013, 04:44.


        • Pete Mack
          Prophet
          • Apr 2007
          • 6883

          #34
          Yuck. JSON is a terrible format for efficiency. Just about every project I've worked on has had serialization problems wherever there's a conversion between ASCII base-10 representation and native binary numbers. The average penalty is about one order of magnitude, so a 2.5 sec cost would drop to a 0.25 sec cost with no other changes. When there are function calls for each field, especially if the calls cross a code-type boundary such as java2native or Python2native, the cost can grow to two orders of magnitude.
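
          To see the effect, you can time rendering a million ints as base-10 text against copying them out as raw binary -- a rough illustration only; exact numbers will vary by machine:
          Code:
            import time
            from array import array

            nums = list(range(1000000))

            start = time.time()
            text = ",".join(str(n) for n in nums)  # one base-10 conversion each
            ascii_cost = time.time() - start

            start = time.time()
            blob = array("i", nums).tostring()     # bulk memory copy, no base-10
            binary_cost = time.time() - start

            print("ascii %.3fs vs binary %.3fs" % (ascii_cost, binary_cost))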

          In short: for any reasonably sized serialization, especially when serialization cost affects performance (such as when going down a level), a big workaround is necessary.

          When I did the J2N conversion, I ended up having to do native serialization (and especially deserialization) of network-compressed XML (indistinguishable, I/O-wise, from JSON). This probably saved the company 3 man-months of wasted labor....

          Bottom line: do not try micro-optimization when macro-optimization is the right solution.

          Edit: I was completely unsurprised when Bing infrastructure discovered it was wasting 66% of its compute time on BCD conversion. Human readable serialization is seriously overrated when you are storing a twisty maze of data points, all (slightly) different.


          • AnonymousHero
            Veteran
            • Jun 2007
            • 1393

            #35
            @Derakon: Could you please post the profile output? (I'm not at a computer where I can run Pyrel myself right now.)

            EDIT#1: I wonder if PyPy might be able to speed this thing up...

            EDIT#2: Another little thing that springs to mind: would it be possible to simply (quickly) clone the whole in-memory object/container tree without any transformation and *then* go through the cleanValue process in a separate thread (effectively hiding the latency)? See copy.deepcopy(...) in the library reference.
            Last edited by AnonymousHero; July 15, 2013, 07:58.


            • Derakon
              Prophet
              • Dec 2009
              • 9022

              #36
              Originally posted by AnonymousHero
              @Derakon: Could you please post the profile output? (I'm not at a computer where I can run Pyrel myself right now.)
              Here's the statprof output:
              Code:
                %   cumulative      self          
               time    seconds   seconds  name    
               16.22      1.72      1.72  serializer.py:181:cleanDict
               12.83      1.36      1.36  serializer.py:109:addObject
               12.14      1.29      1.29  encoder.py:347:_iterencode_dict
                7.27      0.77      0.77  encoder.py:429:_iterencode
                6.07      0.64      0.64  serializer.py:271:makeObjectReferenceString
                6.04      0.64      0.64  encoder.py:294:_iterencode_list
                5.66      0.70      0.60  serializer.py:210:cleanValue
                4.42      0.47      0.47  encoder.py:296:_iterencode_list
                3.86      0.41      0.41  encoder.py:307:_iterencode_list
                3.67      0.39      0.39  encoder.py:355:_iterencode_dict
                2.71      0.29      0.29  encoder.py:384:_iterencode_dict
                2.52      0.27      0.27  serializer.py:106:addObject
                2.28      0.24      0.24  encoder.py:381:_iterencode_dict
                1.92      0.20      0.20  encoder.py:330:_iterencode_list
                1.73      0.18      0.18  serializer.py:192:cleanDict
                1.70      0.18      0.18  serializer.py:217:cleanValue
                1.34      0.14      0.14  encoder.py:406:_iterencode_dict
                1.00      0.11      0.11  serializer.py:218:cleanValue
                0.70      0.07      0.07  encoder.py:392:_iterencode_dict
                0.65      0.08      0.07  serializer.py:221:cleanValue
                0.64      0.07      0.07  encoder.py:303:_iterencode_list
                0.60      1.42      0.06  serializer.py:228:cleanValue
                0.48      0.05      0.05  serializer.py:279:getIsObjectReference
                0.45      0.05      0.05  serializer.py:202:cleanValue
                0.34      0.04      0.04  serializer.py:280:getIsObjectReference
                0.23      0.02      0.02  encoder.py:315:_iterencode_list
                0.18      0.02      0.02  serializer.py:112:addObject
                0.15      0.02      0.02  serializer.py:267:makeObjectReferenceString
                0.14      0.01      0.01  serializer.py:201:cleanValue
                0.13      0.01      0.01  encoder.py:295:_iterencode_list
                0.13      0.01      0.01  serializer.py:189:cleanDict
                0.10      0.01      0.01  encoder.py:358:_iterencode_dict
                0.10      0.01      0.01  serializer.py:89:addObject
                0.09      0.04      0.01  serializer.py:127:addObjectData
                0.08      0.01      0.01  serializer.py:134:addObjectData
                0.08      0.01      0.01  encoder.py:348:_iterencode_dict
                0.08      0.01      0.01  serializer.py:129:addObjectData
                0.08      0.01      0.01  serializer.py:182:cleanDict
                0.06      0.01      0.01  serializer.py:124:addObjectData
                0.05      0.01      0.01  encoder.py:403:_iterencode_dict
                0.05      0.01      0.01  serializer.py:96:addObject
                0.05      0.01      0.01  serializer.py:93:addObject
                0.05      5.49      0.01  serializer.py:191:cleanDict
                0.04      0.02      0.00  serializer.py:128:addObjectData
                0.04      0.00      0.00  encoder.py:204:encode
                0.04      0.00      0.00  encoder.py:383:_iterencode_dict
                0.04      5.11      0.00  __init__.py:238:dumps
                0.04      2.13      0.00  serializer.py:183:cleanDict
                0.04      0.00      0.00  encoder.py:391:_iterencode_dict
                0.04      0.00      0.00  serializer.py:278:getIsObjectReference
                0.04      0.00      0.00  encoder.py:359:_iterencode_dict
                0.04      0.04      0.00  serializer.py:125:addObjectData
                0.03      0.00      0.00  serializer.py:168:writeFile
                0.03      4.21      0.00  encoder.py:402:_iterencode_dict
                0.03      0.00      0.00  serializer.py:110:addObject
                0.03     10.63      0.00  user.py:507:execute
                0.03      4.32      0.00  encoder.py:428:_iterencode
                0.03      0.00      0.00  encoder.py:372:_iterencode_dict
                0.03      0.00      0.00  serializer.py:99:addObject
                0.03      0.00      0.00  serializer.py:123:addObjectData
                0.03      0.00      0.00  encoder.py:282:_iterencode_list
                0.03      0.00      0.00  serializer.py:162:writeFile
                0.03      0.00      0.00  serializer.py:156:writeFile
                0.03      4.79      0.00  serializer.py:268:makeObjectReferenceString
                0.03      0.00      0.00  encoder.py:290:_iterencode_list
                0.03      0.03      0.00  serializer.py:205:cleanValue
                0.03      0.00      0.00  encoder.py:312:_iterencode_list
                0.01      0.00      0.00  pyximport.py:243:find_module
                0.01      0.00      0.00  encoder.py:333:_iterencode_list
                0.01      0.00      0.00  encoder.py:314:_iterencode_list
                0.01      0.00      0.00  serializer.py:88:addObject
                0.01      5.49      0.00  serializer.py:111:addObject
                0.01      0.00      0.00  encoder.py:399:_iterencode_dict
                0.01      0.00      0.00  serializer.py:208:cleanValue
                0.01      5.27      0.00  serializer.py:207:cleanValue
                0.01      0.00      0.00  serializer.py:165:writeFile
                0.01      0.00      0.00  encoder.py:306:_iterencode_list
                0.01      0.00      0.00  serializer.py:142:writeFile
                0.01      0.00      0.00  encoder.py:380:_iterencode_dict
                0.01      0.00      0.00  encoder.py:405:_iterencode_dict
                0.01      0.00      0.00  encoder.py:287:_iterencode_list
                0.01      0.00      0.00  serializer.py:180:cleanDict
                0.01      0.00      0.00  serializer.py:264:makeObjectReferenceString
                0.01      0.00      0.00  encoder.py:340:_iterencode_dict
                0.01      0.00      0.00  encoder.py:393:_iterencode_dict
                0.01      0.00      0.00  serializer.py:211:cleanValue
                0.00     10.63      0.00  __init__.py:48:contextualizeAndExecute
                0.00      4.54      0.00  serializer.py:225:cleanValue
                0.00     10.63      0.00  __init__.py:59:init
                0.00      0.05      0.00  serializer.py:136:addObjectData
                0.00     10.63      0.00  pyrel.py:69:<module>
                0.00      0.13      0.00  serializer.py:114:addObject
                0.00     10.63      0.00  mainApp.py:25:makeApp
                0.00      5.10      0.00  encoder.py:203:encode
                0.00      0.04      0.00  encoder.py:326:_iterencode_list
                0.00     10.63      0.00  commandHandler.py:74:receiveKeyInput
                0.00      0.00      0.00  cProfile.py:9:<module>
                0.00     10.63      0.00  mainFrame.py:79:keyPressEvent
                0.00     10.63      0.00  commandHandler.py:79:asyncExecute
                0.00     10.63      0.00  __init__.py:14:init
                0.00      0.62      0.00  serializer.py:161:writeFile
                0.00      4.18      0.00  serializer.py:167:writeFile
                0.00      0.31      0.00  serializer.py:155:writeFile
              Originally posted by AnonymousHero
              EDIT#1: I wonder if PyPy might be able to speed this thing up...
              Might be worth looking into, but I'd rather not change interpreters if possible: the Python.org 'terp is really the standard one, and I suspect it'd create confusion if we used a different one, especially since devs are expected to install their own interpreters.

              Let me know what the results are if you decide to try this, though.

              Originally posted by AnonymousHero
              EDIT#2: Another little thing that springs to mind: would it be possible to simply (quickly) clone the whole in-memory object/container tree without any transformation and *then* go through the cleanValue process in a separate thread (effectively hiding the latency)? See copy.deepcopy(...) in the library reference.
              I'm not clear on exactly how deepcopy() works. The main question is whether object references in the copy point to objects in the copy or to objects in the original. If the latter, then this won't work.

              In any event, copying the GameMap with deepcopy() takes 6.7s, so, unlikely to be useful.

              Pete: I used JSON because it's the format already being used for datafiles, it's human-readable, and Python has a builtin library for handling it. What would you suggest I use otherwise? And how would that alternative speed things up? I'd still have to examine datatypes to do conversions on function/object references, if nothing else.


              • AnonymousHero
                Veteran
                • Jun 2007
                • 1393

                #37
                Originally posted by Derakon
                Here's the statprof output:
                Code:
                (snip)
                Thanks. (EDIT: ... but it seems kind of odd that there are no invocation counts? That seems relevant.)

                Originally posted by Derakon
                I'm not clear on exactly how deepcopy() works. The main question is whether object references in the copy point to objects in the copy or to objects in the original. If the latter, then this won't work.

                In any event, copying the GameMap with deepcopy() takes 6.7s, so, unlikely to be useful.
                It creates copies of everything and updates references (in the copy "output") to point to further copies (hence "deep" rather than "shallow").
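
                A tiny demonstration of those semantics:
                Code:
                  import copy

                  original = {"inventory": [{"name": "sword"}]}
                  clone = copy.deepcopy(original)

                  # The clone's references point at fresh copies:
                  assert clone["inventory"][0] is not original["inventory"][0]
                  clone["inventory"][0]["name"] = "shield"
                  assert original["inventory"][0]["name"] == "sword"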

                Shame that it takes so long; I was kind of hoping that there might be some C code behind it, since it's in the standard library. (It seems to be pretty typical of the CPython implementation to simply reimplement slow bits of the standard library in C as an optimization.)

                Another last-ditch type of thing: does it work to do an in-memory pickle+unpickle of the game state in order to copy it before doing the real serialization? I was looking around various blogs and postings about the slow performance of deepcopy(), and some suggested that this would be ~3 times faster than deepcopy()... which should bring it into the sub-4s range at the very least. (I believe you'd need to use the cPickle module; I'm not sure whether it needs to be imported explicitly or if Python will do that for you.)
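
                The round-trip clone itself would be tiny -- a sketch, assuming the state pickles at all (on Python 3, plain `import pickle` gets the C version automatically):
                Code:
                  import cPickle

                  def cloneGameState(state):
                      # dumps/loads in memory: copies everything and rewrites
                      # references, like deepcopy(), but with the loop in C.
                      return cPickle.loads(
                          cPickle.dumps(state, cPickle.HIGHEST_PROTOCOL))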


                • Derakon
                  Prophet
                  • Dec 2009
                  • 9022

                  #38
                  The statprof module does statistical profiling -- it sets interrupts and periodically samples which line of code is executing. Compared to cProfile it's less invasive, it points at problematic lines instead of just problematic functions, and it doesn't distort the results much; the tradeoff is that, unlike cProfile, it doesn't give you invocation counts.

                  I tried using cPickle on the GameMap, but it failed with the error "can't pickle instancemethod objects". Maybe it doesn't like Cythonized classes? Or some attribute of the GameMap is making it unhappy. Unfortunately it doesn't tell me which attribute was the problem. So this may still be an option, but it'll need a closer look than I can give it right at the moment.


                  • AnonymousHero
                    Veteran
                    • Jun 2007
                    • 1393

                    #39
                    Dang. Shame.


                    • Pete Mack
                      Prophet
                      • Apr 2007
                      • 6883

                      #40
                      Originally posted by Derakon
                      Pete: I used JSON because it's the format already being used for datafiles, it's human-readable, and Python has a builtin library for handling it. What would you suggest I use otherwise? And how would that alternative speed things up? I'd still have to examine datatypes to do conversions on function/object references, if nothing else.
                      I'd use pickle/unpickle, because it doesn't use decimal. Binary-to-Decimal conversion is expensive. From your profiling, it is costing about 50% of the total time.


                      • Derakon
                        Prophet
                        • Dec 2009
                        • 9022

                        #41
                        Pickle/unpickle are fundamentally verboten because unpickling a file could potentially do anything. As a result, you cannot trust savefiles created by other players (e.g. for competitions). I discussed this earlier.
                        Originally posted by The Python documentation on pickle

                        Warning

                        The pickle module is not intended to be secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source.
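                         A minimal demonstration of why that warning is absolute -- unpickling runs whatever callable the payload names, so a crafted savefile is arbitrary code:
                         Code:
                           import pickle

                           class Evil(object):
                               def __reduce__(self):
                                   import os
                                   return (os.system, ("echo owned",))

                           payload = pickle.dumps(Evil())
                           pickle.loads(payload)  # runs os.system("echo owned")
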
                        What do you mean exactly by "binary to decimal" conversion? Just storing numbers as ASCII instead of binary?


                        • Pete Mack
                          Prophet
                          • Apr 2007
                          • 6883

                          #42
                          Yes, I meant: don't use ASCII. I can almost guarantee that half the time of serialization is in decimal conversion.


                          • Derakon
                            Prophet
                            • Dec 2009
                            • 9022

                            #43
                            Isn't that time all in the JSON encoder, though? I.e. the half that I can just spin off to a separate thread while the player continues their game?


                            • AnonymousHero
                              Veteran
                              • Jun 2007
                              • 1393

                              #44
                              Originally posted by Derakon
                              Isn't that time all in the JSON encoder, though? I.e. the half that I can just spin off to a separate thread while the player continues their game?
                              Indeed -- I wouldn't worry about it. Copying the whole game state as fast as possible should probably be the focus. (Of course, if the state were immutable you'd already be laughing, but I digress. Sorry, couldn't resist -- I realize that other things would probably have become considerably more difficult up front.)
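
                              For concreteness, the split would look something like this -- a sketch, assuming cleanValue() has already produced a JSON-safe structure:
                              Code:
                                import json
                                import threading

                                def saveInBackground(cleanedState, path):
                                    def writeOut():
                                        with open(path, "w") as handle:
                                            # All the base-10 work lands here,
                                            # off the main thread.
                                            json.dump(cleanedState, handle)
                                    worker = threading.Thread(target=writeOut)
                                    worker.start()
                                    return worker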

                              EDIT: Actually -- and perhaps a little more constructively -- is it possible to efficiently detect changes in the game state? I was thinking it might make sense to queue a save, and if the user does something before the save actually happens, start the "save" process over from scratch (something like the sketch below). Perhaps you could even do automatic saves every 5 seconds (assuming that the user idles "enough" at some points). An alternative to starting from scratch would be to serialize only those bits of game state that have actually changed since the attempt at saving. (You'd have some redundant data, but storage space is cheap, so...)
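
                              A hypothetical shape for that queued-save logic (threading.Timer is real; everything else here is invented for the sketch):
                              Code:
                                import threading

                                class SaveQueue(object):
                                    def __init__(self, delay, doSave):
                                        self.delay = delay    # idle time to wait for
                                        self.doSave = doSave  # performs the real save
                                        self.timer = None

                                    def requestSave(self):
                                        self._cancel()
                                        self.timer = threading.Timer(
                                            self.delay, self.doSave)
                                        self.timer.start()

                                    def onStateChanged(self):
                                        # Any change invalidates the pending
                                        # attempt; start the wait over.
                                        if self.timer is not None:
                                            self.requestSave()

                                    def _cancel(self):
                                        if self.timer is not None:
                                            self.timer.cancel()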
                              Last edited by AnonymousHero; July 16, 2013, 19:19.


                              • Derakon
                                Prophet
                                • Dec 2009
                                • 9022

                                #45
                                To be certain I understand your suggestion: the idea here is that when the user hits Save, we start recording the changes they make to the game, and then when we get a spare chance (i.e. the user stops feeding in commands that change the game state), we make a copy of the game state and then "back out" the changes they made since they requested the save?

                                That is an interesting idea. I do eventually want the game to support rewind/replay capabilities, which are most efficiently represented as deltas to the game state. My thought there was to enable spectating: when you wanted to watch another player, you'd request a copy of their game state, and then they would feed you deltas as game actions are taken. And of course, being able to undo actions makes for a great cheat option.

                                Now might be the right time to start thinking about how that will be implemented; it's probably going to be tricky.

                                Regarding mutability: increasingly, game-state-mutating actions are done by way of the GameMap (i.e. the primary object that holds all of the other objects). If I wanted to make the game state immutable, the way to do it would probably be to funnel all changes through the GameMap, directly or indirectly. Of course, the trick then becomes figuring out how to generate a new GameMap every time the player takes an action without it taking 4 seconds.
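
                                As a very rough sketch of that funnel (none of these names are Pyrel's real API), each action could hand back its own undo delta:
                                Code:
                                  class GameMap(object):
                                      def __init__(self):
                                          self.deltas = []

                                      def applyAction(self, action):
                                          # Each action returns a callable that
                                          # reverses it; keep them in order.
                                          undo = action.apply(self)
                                          self.deltas.append(undo)

                                      def backOut(self, count):
                                          # Reverse the newest `count` actions.
                                          for undo in reversed(self.deltas[-count:]):
                                              undo(self)
                                          del self.deltas[-count:]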

