Metadata-Version: 2.1
Name: translitcodec
Version: 0.7.0
Summary: Unicode to 8-bit charset transliteration codec
Home-page: https://github.com/claudep/translitcodec
Author: Jason Kirtland
Author-email: jek@discorporate.us
License: MIT License
Description: Codecs for transliterating Unicode text into best-effort
        representations using smaller coded character sets (ASCII,
        ISO 8859, etc.).  The translation tables used by the codecs are from
        the ``transtab`` collection by Markus Kuhn.
        
        Three types of transliterating codecs are provided:
        
          "long", using as many characters as needed to make a natural
          replacement.  For example, \u00e4 LATIN SMALL LETTER A WITH
          DIAERESIS ``ä`` will be replaced with ``ae``.
        
          "short", using the minimum number of characters to make a
          replacement.  For example, \u00e4 LATIN SMALL LETTER A WITH
          DIAERESIS ``ä`` will be replaced with ``a``.
        
          "one", only performing single character replacements.  Characters
          that cannot be transliterated with a single character are passed
          through unchanged. For example, \u2639 WHITE FROWNING FACE ``☹``
          will be passed through unchanged.
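        
        As a rough sketch of how the three types differ, the following uses
        plain ``str.translate`` with tiny hypothetical tables.  The table
        contents, the ``transliterate`` helper, and the way ``ONE`` is derived
        from ``SHORT`` are assumptions made for illustration, not the package's
        own code::
        
          # Hypothetical three-entry tables keyed by code point; the real
          # tables are built from Markus Kuhn's transtab and are far larger.
          LONG = {0x00E4: "ae", 0x20AC: "EUR", 0x263A: ":-)"}
          SHORT = {0x00E4: "a", 0x20AC: "E", 0x263A: ":-)"}
          # Assumption: "one" keeps only the single-character replacements.
          ONE = {cp: repl for cp, repl in SHORT.items() if len(repl) == 1}
        
          def transliterate(text, table):
              # str.translate leaves code points missing from the table as-is.
              return text.translate(table)
        
          print(transliterate("ä € ☺", LONG))   # ae EUR :-)
          print(transliterate("ä € ☺", SHORT))  # a E :-)
          print(transliterate("ä € ☺", ONE))    # a E ☺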
        
        Using the codecs is simple::
        
          >>> import translitcodec
          >>> import codecs
          >>> codecs.encode('fácil € ☺', 'translit/long')
          'facil EUR :-)'
          >>> codecs.encode('fácil € ☺', 'translit/short')
          'facil E :-)'
        
        The codecs return Unicode by default.  To receive a bytestring back,
        either chain the output of encode() to another codec, or append the
        name of the desired byte encoding to the codec name::
        
          >>> codecs.encode('fácil € ☺', 'translit/one').encode('ascii', 'replace')
          b'facil E ?'
          >>> 'fácil € ☺'.encode('translit/one/ascii', 'replace')
          b'facil E ?'
        
        The package also supplies a 'transliterate' codec, an alias for
        'translit/long'.
        
        Another way to use the library is through an error handler.
        The following error handlers are available:
        
          * 'strict/translit/long', 'strict/translit/short', 'strict/translit/one' - similar to 'strict'
          * 'ignore/translit/long', 'ignore/translit/short', 'ignore/translit/one' - similar to 'ignore'
          * 'replace/translit/long', 'replace/translit/short', 'replace/translit/one' - similar to 'replace'
        
        These error handlers work like Python's built-in ones, except that
        transliteration is attempted first::
        
          >>> codecs.encode('Zażółć gęślą jaźń € ☺另!@#', 'ISO-8859-2', 'replace/translit/long').decode('ISO-8859-2')
          'Zażółć gęślą jaźń EUR :-)?!@#'
          >>> codecs.encode('Zażółć gęślą jaźń € ☺另!@#', 'ISO-8859-2', 'replace/translit/short').decode('ISO-8859-2')
          'Zażółć gęślą jaźń E :-)?!@#'
          >>> codecs.encode('Zażółć gęślą jaźń € ☺另!@#', 'ISO-8859-2', 'replace/translit/one').decode('ISO-8859-2')
          'Zażółć gęślą jaźń E ??!@#'
          >>> codecs.encode('Zażółć gęślą jaźń € ☺另!@#', 'ISO-8859-2', 'ignore/translit/long').decode('ISO-8859-2')
          'Zażółć gęślą jaźń EUR :-)!@#'
          >>> codecs.encode('Zażółć gęślą jaźń € ☺另!@#', 'ISO-8859-2', 'ignore/translit/short').decode('ISO-8859-2')
          'Zażółć gęślą jaźń E :-)!@#'
          >>> codecs.encode('Zażółć gęślą jaźń € ☺另!@#', 'ISO-8859-2', 'ignore/translit/one').decode('ISO-8859-2')
          'Zażółć gęślą jaźń E !@#'
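        
        The general mechanism behind such a handler can be sketched with the
        standard library's ``codecs.register_error``.  The ``TABLE`` mapping,
        the ``replace_translit`` function, and the handler name
        ``demo/replace-translit`` are hypothetical and chosen only for this
        illustration; the package's own handlers may be implemented
        differently::
        
          import codecs
        
          # Hypothetical mini table; the package's real tables are far larger.
          TABLE = {"€": "EUR", "☺": ":-)"}
        
          def replace_translit(error):
              """Try transliteration first, then fall back to '?'."""
              if not isinstance(error, UnicodeEncodeError):
                  raise error
              # The slice of the input that the target codec could not encode.
              bad = error.object[error.start:error.end]
              out = "".join(TABLE.get(ch, "?") for ch in bad)
              # Return the replacement and the position where encoding resumes.
              return out, error.end
        
          codecs.register_error("demo/replace-translit", replace_translit)
        
          encoded = "5 € ☺ 另".encode("iso-8859-2", "demo/replace-translit")
          print(encoded.decode("iso-8859-2"))  # 5 EUR :-) ?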
        
        translitcodec Changes
        =====================
        
        0.7.0
        -----
        Released on May 8, 2021
        
        - Added support for error handlers
        - Fixed conversion of the German eszett char
        
        0.6.0
        -----
        Released on December 13, 2020
        
        - Add support for Python 3.9
        
        0.5.2
        -----
        Released on January 19, 2020
        
        - Install package with setuptools
        
        0.5.1
        -----
        Released on January 19, 2020
        
        - Add python_requires to prevent installation under Python 2
        
        0.5
        ---
        Released on January 18, 2020
        
        - Complete coverage of the Vietnamese alphabet
        
        - Removed Python 2 support
        
        0.4
        ---
        Released on May 11, 2015
        
        - Added Python 3 compatibility
        
        0.3
        ---
        
        Released on February 14, 2011
        
        - Fixes to the transtab table rebuilding tool.
        
        - Added translitcodec.__version__
        
        0.2
        ---
        
        Released on January 27, 2011
        
        - Resolves issue of "TypeError: character mapping must return integer,
          None or unicode" when a blank value (e.g. \N{ZERO WIDTH SPACE}, \u200B)
          was encoded.  Unicode blanks are now returned.
        
        - Characters in the ASCII range are no longer included in the translation
          tables.
        
        0.1
        ---
        
        Released on December 28, 2008
        
        - Initial packaged release.
        
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Utilities
Requires-Python: >=3
Description-Content-Type: text/x-rst
