encoding_repair-0.7dev



Description

Helpers to repair encodings (especially umlauts)
Attribute: Value
Operating system: -
File name: encoding_repair-0.7dev
Name: encoding_repair
Library version: 0.7dev
Maintainer: []
Maintainer email: []
Author: Niels Ranosch
Author email: ranosch@mfo.de
Home page: https://bitbucket.org/niels_mfo/encoding_repair
Package URL: https://pypi.org/project/encoding_repair/
License: MIT

It is alarming how often special characters such as umlauts break when text is converted between encodings. (You might want to take a look at the German Amazon Marketplace.) A broken umlaut is still valid in the target encoding and can therefore only be detected through heuristics (magic).

Version 0.5: supports utf-8 and latin1.

For a full changeset, take a look at bitbucket.org/niels_mfo/encoding_repair (bug reports are also accepted there).

A common case that breaks a special character is the following:
- An input string is encoded in utf-8 (which uses multi-byte characters).
- It is interpreted as a valid latin1 string.
- Latin1 has a valid representation for nearly all bytes.
- Latin1 uses single-byte characters.
- Now both bytes of the multi-byte character are interpreted as separate characters.
- The special character has broken into two different (valid!) characters.

This scenario has many pitfalls:
- The characters are irreversibly broken...
- ... regardless of what you do with the string.
- You can convert it through all encodings and the umlauts won't come back.
- Only through a few heuristic replacements is this module able to help you.

This module assumes that a few special characters are always correct. They are stored in the list 'umlauts'. Furthermore, the module assumes that the byte representation of these characters that would be correct in the other encoding is always broken when it appears in the target encoding.

NOTE: This only happens because people don't use unicode. If everybody consistently used unicode strings, I would not have had to write this module.

The best and actually only way to handle encodings correctly is the following:
1. An input string comes into your program.
2. If it is unicode, jump to point 6.
3. If it isn't, you might already need to repair umlauts.
4. You need to make sure that you know the right encoding of the input string, because it is hardly possible to guess.
5. Convert it to unicode.
6. Use the unicode string throughout your whole program.
7. If you can return unicode, return unicode.
8. If you are in doubt, return unicode.
9. If you really need to return anything else, return utf-8.
10. If you are certain that the program which will take your output can handle neither unicode nor utf-8, you had better write a bug report.
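
The breakage and the heuristic repair described above can be sketched in a few lines of Python 3, where "unicode" corresponds to the built-in str type. This is a minimal illustration under stated assumptions, not the encoding_repair API: the names UMLAUTS, break_text, repair_umlauts, to_text and to_output are hypothetical, and the character list is only an example of what a list like 'umlauts' might contain.

```python
# Hypothetical sketch of the ideas in this description; NOT the encoding_repair API.

# Assumed example list of characters that are treated as "always correct".
UMLAUTS = ["ä", "ö", "ü", "Ä", "Ö", "Ü", "ß"]

# For each character, compute its broken form: the UTF-8 bytes misread as Latin-1.
BROKEN_TO_FIXED = {ch.encode("utf-8").decode("latin1"): ch for ch in UMLAUTS}


def break_text(text):
    """Reproduce the failure case: UTF-8 bytes interpreted as Latin-1."""
    return text.encode("utf-8").decode("latin1")


def repair_umlauts(text):
    """Heuristic repair: replace every known broken form with the intended character."""
    for broken, fixed in BROKEN_TO_FIXED.items():
        text = text.replace(broken, fixed)
    return text


def to_text(data, encoding):
    """Points 1 to 5: turn input into a unicode string as early as possible."""
    if isinstance(data, str):
        return data  # already unicode, nothing to decode
    # The caller must supply the encoding; guessing it is unreliable.
    # The repair step covers input that was mis-decoded or mis-labelled.
    return repair_umlauts(data.decode(encoding))


def to_output(text):
    """Point 9: if bytes are really required, emit UTF-8."""
    return text.encode("utf-8")


if __name__ == "__main__":
    broken = break_text("Müller")
    print(repr(broken))
    print(repr(repair_umlauts(broken)))
```

The repair relies on the same assumption the module states: a two-character sequence such as "Ã" followed by "¼" is never intended text, so replacing it is safe. Text that legitimately contains such a sequence would be altered by this sketch.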


Installation


Install the whl package encoding_repair-0.7dev:

    pip install encoding_repair-0.7dev.whl


Install the tar.gz package encoding_repair-0.7dev:

    pip install encoding_repair-0.7dev.tar.gz