Ticket #13058 (confirmed Bug)
Plone does not properly handle some common quotation characters in titles
Reported by: | keul | Owned by: | |
---|---|---|---|
Priority: | minor | Milestone: | 4.x |
Component: | General | Version: | 4.2 |
Keywords: | | Cc: | keul |
Description
It is quite common (in languages like Italian, and for users who copy and paste page titles from Microsoft Word) to use quotation characters that are not the classic " character but typographic forms such as “ and ” (hex code points: 201C and 201D).
In some languages MS Word changes a sentence like *"foo document"* to *“foo document”*.
When such pasted text is used as a document title, the generated document id is not really readable. For the example above it becomes *201cfoo-document201d*.
Looking at the code, all of this seems to be handled by the plone.i18n module.
Inside the plone.i18n.normalizer.baseNormalizer function there's a call to decomposition(ch). This function (from the unicodedata module of the Python standard library) is documented as follows:
Returns the character decomposition mapping assigned to the Unicode character unichr as string. An empty string is returned in case no such mapping is defined.
So it seems that no ASCII alternative is defined for those two characters (is this a bug in Python? I think I will open a bug report there).
Would it be a bad idea to fix this in Plone?
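The behaviour described above can be reproduced outside Plone with a minimal sketch. It is written for Python 3 and is not the plone.i18n code: `to_ascii` and `QUOTE_FALLBACK` are hypothetical names used only for illustration, and the hex-code fallback is assumed from the id shown in this report. The sketch shows that `unicodedata.decomposition` returns an empty string for both quotation marks, and how an explicit mapping table consulted before the decomposition could give a readable result.

```python
import unicodedata

# Hypothetical fallback table for characters without a decomposition.
# This table is an illustration only; it is not part of plone.i18n.
QUOTE_FALLBACK = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
}


def to_ascii(text, fallback=None):
    """Map each character to ASCII: fallback table first, then the base
    character of its Unicode decomposition, then its hex code point."""
    fallback = fallback or {}
    result = []
    for ch in text:
        if ord(ch) < 128:
            result.append(ch)
            continue
        if ch in fallback:
            result.append(fallback[ch])
            continue
        # Drop compatibility tags such as "<compat>" and keep the code points.
        points = [p for p in unicodedata.decomposition(ch).split()
                  if not p.startswith("<")]
        base = chr(int(points[0], 16)) if points else ""
        if base and ord(base) < 128:
            result.append(base)
        else:
            # No usable decomposition: fall back to the hex code point.
            result.append("%x" % ord(ch))
    return "".join(result)


title = "\u201cfoo document\u201d"
print(repr(unicodedata.decomposition("\u201c")))  # '' (no mapping defined)
print(to_ascii(title))                            # 201cfoo document201d
print(to_ascii(title, QUOTE_FALLBACK))            # "foo document"
```

Whether such a table belongs in Plone or in a site-specific normalizer is the open question of this ticket; the sketch only shows that the lookup has to happen before the decomposition step.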
Change History
comment:2 follow-up: ↓ 3 Changed 4 years ago by kleist
- Status changed from new to confirmed
- Component changed from Unknown to General
Maybe something like this? http://myzope.kedai.com.my/blogs/kedai/128
comment:3 in reply to: ↑ 2 Changed 4 years ago by keul
Replying to kleist:
Maybe something like this? http://myzope.kedai.com.my/blogs/kedai/128
Yes, I think something like this!
BTW: it seems that this is not a Python bug (I'm totally ignorant about Unicode :-): http://bugs.python.org/issue15372?
comment:4 Changed 3 years ago by keul
Any suggestions on how to fix this problem?
Today I found another similar issue: if you call putils.normalizeString("Forlì") you get "forla" and not "forli" as expected. I think it is more or less the same issue as above.
I'd like to take a look at this but I need some general direction.
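As a starting point for the "Forlì" case, a small Python 3 check (not Plone code; `base_char` is a hypothetical helper) shows that the standard library does define a decomposition for ì, so a proper Unicode string should yield "i". The second half rests on an unconfirmed assumption: that the UTF-8 bytes of the title are read as Latin-1 somewhere before normalization. If that happens, the first of the two resulting characters decomposes to "A", which would be consistent with the "forla" result, but the ticket does not establish this as the cause.

```python
import unicodedata


def base_char(ch):
    """Return the first code point of ch's decomposition, or ch itself."""
    points = [p for p in unicodedata.decomposition(ch).split()
              if not p.startswith("<")]
    return chr(int(points[0], 16)) if points else ch


print(unicodedata.decomposition("\u00ec"))  # 0069 0300
print(base_char("\u00ec"))                  # i

# Assumption: the UTF-8 bytes of the title get read as Latin-1 somewhere.
misread = "Forl\u00ec".encode("utf-8").decode("latin-1")
print([hex(ord(c)) for c in misread[4:]])   # ['0xc3', '0xac']
print(base_char(misread[4]))                # A
```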