Ticket #9416 (confirmed Bug)

Opened 7 years ago

Last modified 4 years ago

Speed up large display lists by 10x

Reported by: runyaga Owned by: nouri
Priority: minor Milestone: 4.x
Component: Archetypes Version:
Keywords: patch Cc:

Description (last modified by limi) (diff)

archetypes/skins/unicodeEncode and unicodeTestIn are like O(nn) or worse.

They are called on every character in vocabularies, so simply getting them out of RestrictedPython would be a significant performance win.

Fortunately they are only referenced 10 times or so in my entire Plone 3.3 checkout.

I created an external method. One issue is that the usage of these AQ Methods is heavily dependent on acquisition.

i.e. eggs/Products.Archetypes-1.5.11-py2.4.egg/Products/Archetypes/Field.py:

values = [instance.unicodeEncode(v)

If we MUST keep the API (i.e. acquisition), I think we lose. The only way around that is FSExternalMethod, because we want as little of the security machinery invoked as possible.

If we break the API (I do not think you want to do that), we could change calls from context.unicodeEncode(value) to unicodeEncode(context, value), which is nicer. Then we can do: from Products.Archetypes.utils import unicodeEncode
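A minimal sketch of what that call-site change might look like. FakeContext is a hypothetical stand-in for an Archetypes content object, and the function body is simplified to the plain string case and written so it also runs standalone on Python 3; it is not the code from the attachment.

```python
# Sketch only: FakeContext is a hypothetical stand-in for an Archetypes
# content object; a real one would find getCharset() through acquisition.
class FakeContext(object):
    def getCharset(self):
        return "utf-8"


def unicodeEncode(context, value, site_charset=None):
    """Simplified module-level version: encode text to the site charset."""
    if site_charset is None:
        site_charset = context.getCharset()
    if isinstance(value, bytes):
        # Already encoded: decode first so input in the wrong charset fails loudly.
        value = value.decode(site_charset)
    return value.encode(site_charset)


context = FakeContext()

# Old style (acquired skin script, runs in RestrictedPython):
#     context.unicodeEncode(value)
# New style (plain import, unrestricted Python):
encoded = unicodeEncode(context, u"caf\xe9")
```

The explicit context argument keeps the getCharset() lookup but drops the acquisition-based method lookup, which is the part that forces the call through the security machinery.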

-- trac isn't allowing me to upload attachments --

# The quick goal is to take the 2 functions that are called recursively
# in the security sandbox and run them in unrestricted Python instead.
# No idea why this is not the case today in modern AT.

def unicodeEncode(self, value, site_charset=None):
    """Encode value (or each item of a list/tuple) as a str in the site charset."""
    if isinstance(value, (tuple, list)):
        encoded = [unicodeEncode(self, v, site_charset=site_charset)
                   for v in value]
        if isinstance(value, tuple):
            encoded = tuple(encoded)
        return encoded

    if not isinstance(value, basestring):
        value = str(value)

    if site_charset is None:
        site_charset = self.getCharset()

    if isinstance(value, str):
        value = unicode(value, site_charset)

    # don't try to catch unicode error here
    # if one occurs, that means the site charset must be changed !
    return value.encode(site_charset)


def unicodeTestIn(self, value, vocab):
    """Return whether value is in vocab, comparing site-charset encoded strings."""
    if vocab is None or len(vocab) == 0:
        return 0

    charset = self.getCharset()
    value = unicodeEncode(self, value, site_charset=charset)
    vocab = [unicodeEncode(self, v, site_charset=charset) for v in vocab]

    return value in vocab
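Even outside RestrictedPython, unicodeTestIn re-encodes the whole vocabulary on every call, so testing each of n values against an n-item vocabulary still costs on the order of n*n encodings. A hedged sketch of one way around that, assuming callers can hold on to a precomputed set: encode the vocabulary once and reuse it. The names encode and make_vocab_tester are hypothetical, not part of Archetypes, and encode is a simplified stand-in for unicodeEncode written to run on Python 3.

```python
def encode(value, charset="utf-8"):
    """Simplified stand-in for unicodeEncode: return bytes in charset."""
    if isinstance(value, bytes):
        # Round-trip so input in the wrong charset raises instead of passing.
        value = value.decode(charset)
    elif not isinstance(value, str):
        value = str(value)
    return value.encode(charset)


def make_vocab_tester(vocab, charset="utf-8"):
    """Encode vocab once and return a membership test backed by a set."""
    encoded_vocab = frozenset(encode(v, charset) for v in (vocab or ()))

    def test_in(value):
        return encode(value, charset) in encoded_vocab

    return test_in


vocab = ["red", "green", u"bl\xe5"]
is_selected = make_vocab_tester(vocab)
selected = [v for v in ["green", b"bl\xc3\xa5", "pink", 7] if is_selected(v)]
```

Each lookup then encodes only the value being tested, and an empty or missing vocabulary simply yields a tester that always returns False.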

Attachments

atoptimizations.py Download (1.1 KB) - added by runyaga 7 years ago.
external methods, more optimizations could work if we had a stricter API

Change History

Changed 7 years ago by runyaga

external methods, more optimizations could work if we had a stricter API

comment:1 Changed 7 years ago by jonstahl

  • Keywords patch added

comment:2 Changed 7 years ago by alecm

  • Owner changed from alecm to nouri

I think Nouri needs to make the call on where to put this and what release(s) it's safe for, but it seems like a good idea to me. If we keep the old py scripts around with some simple deprecation warnings and explicitly use e.g. "unicodeEncode python:modules['Products.Archetypes.utils'].unicodeEncode" in the AT templates that need it, I don't think there's actually any BBB issue to worry about. That way we keep the API but add a newer, faster, better one that we consume internally.
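A rough sketch of that keep-and-deprecate idea, written as plain Python rather than as a skin script. The name legacy_unicodeEncode and the FakeContext class are hypothetical, the module path in the warning text is only the location proposed in this ticket, and whether the existing skin scripts can emit such a warning from inside RestrictedPython is not addressed here.

```python
import warnings


def unicodeEncode(context, value, site_charset=None):
    """Simplified stand-in for the proposed utils-level function."""
    if site_charset is None:
        site_charset = context.getCharset()
    if isinstance(value, bytes):
        value = value.decode(site_charset)
    return value.encode(site_charset)


def legacy_unicodeEncode(context, value, site_charset=None):
    """Old entry point kept for backwards compatibility (hypothetical name)."""
    warnings.warn(
        "context.unicodeEncode is deprecated; use "
        "Products.Archetypes.utils.unicodeEncode instead.",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not this wrapper
    )
    return unicodeEncode(context, value, site_charset=site_charset)


class FakeContext(object):
    """Hypothetical stand-in for an Archetypes content object."""

    def getCharset(self):
        return "utf-8"


# Third-party code calling the old name still works, but gets a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = legacy_unicodeEncode(FakeContext(), "abc")
```

Internal callers would use the new function directly, so only third-party code still going through the old name pays for the extra indirection.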

comment:3 Changed 7 years ago by hannosch

I thought we fixed these calls in AT 1.5 way back. At that point we did introduce the Archetypes.browser.widgets module which has specific implementations for cases which used unicodeEncode or checkSelected before.

I think a similar approach of more specialized browser views or utils.py functions for specific cases should be applicable here as well. External methods are a definite no-go these days.

From what I can tell, the unicodeEncode method is actually only used in the validate_vocabulary method in Field.py and in the displayValue script. While it's also used inside the checkSelected script, that script itself isn't used anymore; it was replaced with those specialized views.

comment:4 Changed 7 years ago by limi

  • Description modified (diff)

comment:5 Changed 5 years ago by potzenheimer

Just attached a diff to #8792 that addresses a performance issue with a large vocabulary in keyword.pt, where the unicodeTestIn.py script was still used, which in turn uses the unicodeEncode.py Restricted Python script.

comment:6 Changed 4 years ago by kleist

  • Status changed from new to confirmed
  • Version set to 3.3

Still an issue in Plone 4?

comment:7 Changed 4 years ago by hannosch

  • Version 3.3 deleted
  • Milestone changed from 3.3.x to 4.x

As long as #8792 isn't fixed this is likely still a problem as well.
