Parser error when modifying the monsters file

  • Pondlife
    Apprentice
    • Mar 2010
    • 78

    #16
    It looks like the BOM (byte order mark) at the start of the text stream is causing the problem.

    The first few lines of the original monster.txt from Angband 4.2 look like this:

    Code:
    00000000: 2320 4669 6c65 3a20 6d6f 6e73 7465 722e  # File: monster.
    00000010: 7478 740a 0a0a 2320 5468 6973 2066 696c  txt...# This fil
    00000020: 6520 6973 2075 7365 6420 746f 2069 6e69  e is used to ini
    00000030: 7469 616c 697a 6520 7468 6520 226d 6f6e  tialize the "mon
    00000040: 7374 6572 2072 6163 6522 2069 6e66 6f72  ster race" infor
    00000050: 6d61 7469 6f6e 2066 6f72 2041 6e67 6261  mation for Angba
    00000060: 6e64 2e0a 0a23 2044 6f20 6e6f 7420 6d6f  nd...# Do not mo

    Your monster.txt file in the zip file you posted looks like this. Note the 0xefbbbf at the beginning - this is the BOM encoded as UTF-8:

    Code:
    00000000: efbb bf23 2046 696c 653a 206d 6f6e 7374  ...# File: monst
    00000010: 6572 2e74 7874 0a0a 0a23 2054 6869 7320  er.txt...# This
    00000020: 6669 6c65 2069 7320 7573 6564 2074 6f20  file is used to
    00000030: 696e 6974 6961 6c69 7a65 2074 6865 2022  initialize the "
    00000040: 6d6f 6e73 7465 7220 7261 6365 2220 696e  monster race" in
    00000050: 666f 726d 6174 696f 6e20 666f 7220 416e  formation for An
    00000060: 6762 616e 642e 0a0a 2320 446f 206e 6f74  gband...# Do not
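If you don't have a hex dump tool handy, the same check can be done by looking at the first three bytes of the file. This is only a minimal sketch: `has_utf8_bom` is a made-up helper name, and the sample bytes are copied from the start of the two dumps above.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


# First eight bytes of each dump shown above.
original = bytes.fromhex("2320 4669 6c65 3a20")
modified = bytes.fromhex("efbb bf23 2046 696c")

print(has_utf8_bom(original))  # False
print(has_utf8_bom(modified))  # True
```

For a real file, open it in binary mode and pass in the first few bytes, for example `has_utf8_bom(open(path, "rb").read(3))`.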
    Playing roguelikes on and off since 1984.
    rogue, hack, moria, nethack, angband & zangband.

    • Adam
      Adept
      • Feb 2016
      • 194

      #17
      What you need to do is open the file in Notepad++, choose Format/Encode in ANSI, then save. After that it should work.
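If you would rather not go through an editor, removing the three BOM bytes can also be scripted. The sketch below assumes the only problem is a leading UTF-8 BOM; `strip_utf8_bom` is a hypothetical helper, and it does not convert any non-ASCII characters to ANSI.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path: str) -> bool:
    """Remove a leading UTF-8 BOM from the file in place.

    Returns True if a BOM was found and removed, False if the file was
    left untouched. Works on raw bytes, so line endings are preserved.
    """
    file = Path(path)
    data = file.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    file.write_bytes(data[len(codecs.BOM_UTF8):])
    return True
```

You would call it as `strip_utf8_bom("monster.txt")` with the path to your own copy of the file. Keep a backup first, since the file is overwritten.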

      • Glorfindel
        Apprentice
        • Mar 2019
        • 60

        #18
        It worked! And contrary to my experience so far, getting over this difficulty did not simply reveal a new error message! Thank you very much!

        [edit] Not to say that your response to the problem I started the thread for was unhelpful. I just had a long series of failures when trying things on my own.

        • Gwarl
          Administrator
          • Jan 2017
          • 986

          #19
          I'm actually impressed by the response here, seems like a fairly obscure problem that would be easy to miss for a novice. It's not a problem I've encountered and it's good that we now have an answer ready for anyone to see.

          • Pete Mack
            Prophet
            • Apr 2007
            • 6697

            #20
            Pretty common actually--I was guessing BOM too, but according to the notepad++ docs, the default encoding is supposed to apply only to new docs, not existing ones. In retrospect, I suspect what happened is that 'save as' is interpreted as a new doc.

            • Gauss
              Apprentice
              • Aug 2018
              • 91

              #21
              Originally posted by Pete Mack
              Pretty common actually--I was guessing BOM too, but according to the notepad++ docs, the default encoding is supposed to apply only to new docs, not existing ones. In retrospect, I suspect what happened is that 'save as' is interpreted as a new doc.
              Notepad normally uses ANSI encoding, so when it opens a file it has to guess from the data whether the file is actually UTF-8; I think it makes an educated guess. If you save a file as UTF-8, Notepad puts the BOM (byte order mark) EF BB BF at the beginning of the file.
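The byte-level effect of saving "UTF-8 with BOM" can be reproduced in Python, whose `utf-8-sig` codec writes the same three bytes. This only illustrates the resulting bytes; it is not a claim about how Notepad is implemented internally.

```python
text = "# File: monster.txt\n"

plain = text.encode("utf-8")         # UTF-8 without a BOM
with_bom = text.encode("utf-8-sig")  # UTF-8 with the BOM prepended

print(with_bom[:3].hex())            # efbbbf
print(with_bom[3:] == plain)         # True

# Decoding with "utf-8-sig" drops the BOM if present; plain "utf-8" keeps it
# as the character U+FEFF at the start of the string.
print(with_bom.decode("utf-8-sig") == text)           # True
print(with_bom.decode("utf-8").startswith("\ufeff"))  # True
```

A reader that does not expect the BOM sees those three extra bytes as part of the first line, which fits the parser error reported earlier in the thread.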

              • wobbly
                Prophet
                • May 2012
                • 2575

                #22
                Nice, glad someone who knows more about these things got there first.

                Originally posted by Pete Mack
                Pretty common actually--I was guessing BOM too, but according to the notepad++ docs, the default encoding is supposed to apply only to new docs, not existing ones. In retrospect, I suspect what happened is that 'save as' is interpreted as a new doc.
                In Preferences, under New Document, Encoding, the default is UTF-8, with a little tick box "Apply to open ANSI files" defaulting to on. Presumably if you hit Save As it will save the file in UTF-8.

                • Tibarius
                  Swordsman
                  • Jun 2011
                  • 426

                  #23
                  Some more information for users of Windows operating systems

                  Some more details that may be of interest to Windows users:
                  An unzipped Angband text file in /lib/gamedata (version 4.1.3) has a size on disk of 4096 bytes or a multiple of that. This reflects the allocation unit (cluster) size of the file system the files were unzipped to, which stores files in whole blocks; it says nothing about the system that created the zip file.

                  Example: p_race.txt has a size of 4621 bytes and a size on disk of 8192 bytes.
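The p_race.txt figures are consistent with the file size being rounded up to a whole number of 4096-byte blocks. A sketch of that arithmetic, assuming a 4096-byte block size (inferred from the numbers above, not reported by any tool); `size_on_disk` is a hypothetical helper:

```python
def size_on_disk(file_size: int, cluster_size: int = 4096) -> int:
    """Round a file size up to a whole number of clusters."""
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size


print(size_on_disk(4621))  # 8192
```

So the difference between the two numbers is just padding in the last block and has no effect on how Angband reads the file.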

                  The text files have no byte order mark; they are encoded in the ANSI text format. So if you want to modify them, make sure you save with ANSI encoding.

                  Wordpad: does that automatically
                  Editor: lets you choose the type (ANSI / Unicode / UTF-8) when you Save As
                  Notepad++: has to be set to ANSI (even if you open an ANSI file with Notepad++, the predefined save option is UTF-8 encoding)

                  The Editor has some kind of problem saving monster.txt in ANSI encoding: afterwards Angband comes up with a parse error. The size of the file changed, but I didn't bother to analyze what exactly changed.

                  Also notable: the original files use hex '0A' (decimal 10) as the newline character. So opening them with an editor that expects a '0D' carriage return before each '0A' (for example the simple Editor) will render the file almost unreadable, because the number of chars per line is 1024 in that case.

                  Notepad++ can handle this format and shows readable lines. And even if you save the file in ANSI format, the only end-of-line characters are still 0As.

                  Wordpad does add a hex '0D' character at the end of each line. In the case of p_race.txt the file size increases to 4879 bytes.

                  The Angband application can parse both 0A alone and 0D 0A.
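The size change is easy to check: converting 0A to 0D 0A adds one byte per line ending, so 4879 - 4621 = 258 suggests p_race.txt has 258 line endings. Below is a small sketch for counting and normalizing line endings on raw bytes; the function names and the sample bytes are made up for illustration.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF (0D 0A) and bare LF (0A) line endings in raw file bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # LFs that are not part of a CRLF pair
    return {"crlf": crlf, "lf": lf}


def to_lf(data: bytes) -> bytes:
    """Convert CRLF line endings to bare LF, leaving everything else alone."""
    return data.replace(b"\r\n", b"\n")


sample = b"# line one\r\n# line two\r\n"
print(count_line_endings(sample))        # {'crlf': 2, 'lf': 0}
print(len(sample) - len(to_lf(sample)))  # 2
```

Since Angband accepts both forms, converting back is optional; it mostly matters if you want your edited file to stay byte-comparable with the original.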

                  Conclusion: the best editor of those three for me is Notepad++. It does not change the file size by adding 0D chars, it has line numbers, and you can choose which encoding to use when you save the file. The only drawback is that it always uses UTF-8 as the predefined encoding, even if you open an ANSI-encoded file.
                  Blondes are more fun!
