Utf-8

  • Nick
    Vanilla maintainer
    • Apr 2007
    • 9636

    Utf-8

    UTF-8 character handling has now come to the nightlies, as described here.

    Is there any adjustment that has to be made to this site to accommodate that? Pav, do you have any recommendations for how things like screenshots and character dumps should be coded?
    One for the Dark Lord on his dark throne
    In the Land of Mordor where the Shadows lie.
  • pav
    Administrator
    • Apr 2007
    • 793

    #2
    As existing iso-8859-1 encoding is forward compatible with Utf-8, it sounds easiest for everyone if everything is in regular Utf-8.

    Do you have a sample dump for me to play with?
    See the elves and everything! http://angband.oook.cz

    • Nick
      Vanilla maintainer
      • Apr 2007
      • 9636

      #3
      I think this one is - check out Nar, Khim etc in the history section.
      One for the Dark Lord on his dark throne
      In the Land of Mordor where the Shadows lie.

      • pav
        Administrator
        • Apr 2007
        • 793

        #4
        Aha. Should be fine for the dump content now. Should I expect utf-8 to ever appear in character class, race and name fields? (Also it's funny that this bb software is not utf-8 enabled. When I flipped the charset on it, Timo lost his ä...)
        See the elves and everything! http://angband.oook.cz

        • andrewdoull
          Unangband maintainer
          • Apr 2007
          • 872

          #5
          Originally posted by Nick
          UTF-8 character handling has now come to the nightlies, as described here.

          Is there any adjustment that has to be made to this site to accommodate that? Pav, do you have any recommendations for how things like screenshots and character dumps should be coded?
          More importantly, what are the new monster types being added?

          å∑€œ®†¥
          The Roflwtfzomgbbq Quylthulg summons L33t Paladins -more-
          In UnAngband, the level dives you.
          ASCII Dreams: http://roguelikedeveloper.blogspot.com
          Unangband: http://unangband.blogspot.com

          • nppangband
            NPPAngband Maintainer
            • Dec 2008
            • 926

            #6
            Originally posted by pav
            Aha. Should be fine for the dump content now. Should I expect utf-8 to ever appear in character class, race and name fields? (Also it's funny that this bb software is not utf-8 enabled. When I flipped the charset on it, Timo lost his ä...)
            Dúnedain have been accented in NPP for years. Not sure about Vanilla. I am sure you will be seeing it more and more with utf-8.
            NPPAngband current home page: http://nppangband.bitshepherd.net/
            Source code repository:
            https://github.com/nppangband/NPPAngband_QT
            Downloads:
            https://app.box.com/s/1x7k65ghsmc31usmj329pb8415n1ux57

            • Derakon
              Prophet
              • Dec 2009
              • 9022

              #7
              Originally posted by andrewdoull
              More importantly, what are the new monster types being added?

              å∑€œ®†¥
              Clearly Morgoth should be Ω.

              • Narvius
                Knight
                • Dec 2007
                • 589

                #8
                Any otherworldly horrors should be Greek letters. Which, incidentally, implies Banishment immunity.
                If you can convincingly pretend you're crazy, you probably are.

                • Jungle_Boy
                  Swordsman
                  • Nov 2008
                  • 434

                  #9
                  Originally posted by andrewdoull
                  More importantly, what are the new monster types being added?

                  å∑€œ®†¥
                  Isn't that last one the Amulet of Yendor? Not that I ever found it.
                  My first winner: http://angband.oook.cz/ladder-show.php?id=10138

                  • david3x3x3
                    Scout
                    • Jun 2009
                    • 28

                    #10
                    I saw the notes on the wiki page regarding adapting these changes to Android. I've got a few alternative ideas.

                    a) The characters on the canvas are stored as wchar_t, and on Android wchar_t is only one byte wide, so each character gets just one byte of storage. Potentially we could use the overloaded Term_mbstowcs to translate from UTF-8 to ISO8859-1 so that we can still use the wchar_t for storage on Android.

                    b) We could change all the existing wchar_t types to term_wchar_t and then do something like

                    #ifdef USE_AND
                    typedef unsigned short term_wchar_t;
                    #else
                    typedef wchar_t term_wchar_t;
                    #endif

                    This would allow us to use all characters in the BMP (basic multilingual plane) range and not just ISO8859-1.

                    Option A is probably a lot less work, but might not support all the weird characters that someone might throw into the edit files.
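
To make the trade-off between the two options concrete, here is a rough sketch of a single UTF-8 decoder with a configurable upper limit. The function name `decode_utf8_limited` and the `max_cp` parameter are hypothetical and not taken from the Angband or angdroid source; the sketch also assumes that out-of-range characters should be replaced with '?', which the post does not specify. It does not reject overlong encodings.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the Angband source): decode one UTF-8
 * sequence starting at s. The code point is stored in *cp and the number
 * of bytes consumed (always at least 1) is returned. Malformed input and
 * code points above max_cp are replaced with '?'.
 */
size_t decode_utf8_limited(const char *s, unsigned long max_cp,
                           unsigned long *cp)
{
    const unsigned char *p = (const unsigned char *)s;
    unsigned long c;
    size_t len, i;

    assert(s != NULL && cp != NULL);

    if (p[0] < 0x80) {
        c = p[0];
        len = 1;
    } else if ((p[0] & 0xE0) == 0xC0) {
        c = p[0] & 0x1F;
        len = 2;
    } else if ((p[0] & 0xF0) == 0xE0) {
        c = p[0] & 0x0F;
        len = 3;
    } else if ((p[0] & 0xF8) == 0xF0) {
        c = p[0] & 0x07;
        len = 4;
    } else {
        /* Stray continuation byte or invalid lead byte */
        *cp = '?';
        return 1;
    }

    for (i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80) {
            /* Sequence cut short; do not swallow the next character */
            *cp = '?';
            return i;
        }
        c = (c << 6) | (p[i] & 0x3F);
    }

    /* Option A would pass 0xFF here, option B 0xFFFF */
    *cp = (c <= max_cp) ? c : '?';
    return len;
}
```

With `max_cp` set to 0xFF this behaves like option A (ISO8859-1 only, so the euro sign becomes '?'), and with 0xFFFF it behaves like option B (anything in the BMP fits in an unsigned short).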

                    • david3x3x3
                      Scout
                      • Jun 2009
                      • 28

                      #11
                      I've got this working on Android now. I'm going with my first idea and storing ISO-8859-1 in wchar_t. The code is in the normal place (http://code.google.com/p/angdroid). There's an installable app package at http://angdroid.org/public_ftp/.

                      I wrote this code late at night and I now see one or two problems that I'll have to fix, but I don't think that they will introduce bugs to the game. My mbstowcs() implementation is a little bit off. I need to see if wcstr is null to know when I'm returning the length without a copy (I was looking at the length parameter), and I'm not supposed to null terminate wcstr if length is less than the length of the string.
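
The two rules mentioned above (a null destination means "just return the length", and no terminator is written when the buffer fills up) can be sketched roughly as follows. This is not the angdroid code: `latin1_mbstowcs` is a hypothetical name, and replacing anything outside ISO-8859-1 with '?' is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical mbstowcs()-style converter (not the angdroid code).
 * Converts UTF-8 in src to ISO-8859-1 code points stored in wchar_t.
 *  - dest == NULL: n is ignored and the number of wide characters the
 *    full conversion would need (excluding the terminator) is returned.
 *  - otherwise: at most n wide characters are written, and the
 *    terminator is added only if there is still room for it.
 * Anything that is not ASCII or a two-byte U+0080..U+00FF sequence
 * becomes '?'.
 */
size_t latin1_mbstowcs(wchar_t *dest, const char *src, size_t n)
{
    const unsigned char *p = (const unsigned char *)src;
    size_t count = 0;

    assert(src != NULL);

    while (*p != '\0') {
        wchar_t wc;

        /* Buffer full: stop without writing a terminator */
        if (dest != NULL && count >= n)
            return count;

        if (p[0] < 0x80) {
            wc = (wchar_t)p[0];
            p += 1;
        } else if ((p[0] == 0xC2 || p[0] == 0xC3) &&
                   (p[1] & 0xC0) == 0x80) {
            /* Lead bytes 0xC2 and 0xC3 cover U+0080..U+00FF */
            wc = (wchar_t)(((p[0] & 0x03) << 6) | (p[1] & 0x3F));
            p += 2;
        } else {
            /* Outside ISO-8859-1 or malformed: one '?' per sequence */
            wc = L'?';
            p += 1;
            while ((*p & 0xC0) == 0x80)
                p++;
        }

        if (dest != NULL)
            dest[count] = wc;
        count++;
    }

    if (dest != NULL && count < n)
        dest[count] = L'\0';
    return count;
}
```

Calling it with a null destination first gives the size to allocate; calling it again with a buffer of that size plus one leaves room for the terminator.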

                      • PowerWyrm
                        Prophet
                        • Apr 2008
                        • 2986

                        #12
                        Originally posted by pav
                        Aha. Should be fine for the dump content now. Should I expect utf-8 to ever appear in character class, race and name fields? (Also it's funny that this bb software is not utf-8 enabled. When I flipped the charset on it, Timo lost his ä...)
                        Still doesn't work for PWMAngband dumps... The non-standard characters are displayed as little squares.
                        PWMAngband variant maintainer - check https://github.com/draconisPW/PWMAngband (or http://www.mangband.org/forum/viewforum.php?f=9) to learn more about this new variant!
