Ticket #9376 (closed PLIP: wontfix)

Opened 7 years ago

Last modified 7 years ago

Include archetypes.schematuning

Reported by: jensens Owned by: jensens
Priority: minor Milestone: 4.0
Component: Archetypes Version:
Keywords: Cc:

Description (last modified by jensens)

Proposer: Hedley Roos
Seconder: Jens W. Klein

Motivation

The Schema method of BaseObject is looked up several times per request for every object taking part in the request. This method is very heavy and leads to performance problems. Caching solves the problem.

Assumptions

Products.Archetypes.BaseObject.BaseObject.Schema is called too many times, and each call has a cost.

Proposal & Implementation

This is already implemented in archetypes.schematuning (http://pypi.python.org/pypi/archetypes.schematuning) and needs to be ported to Archetypes itself. What needs to be done, and how should it be done?

Deliverables

Updated unit tests for Archetypes are already part of archetypes.schematuning (see PyPI or svn). These need to be merged.

It uses plone.memoize to cache Archetypes schemas instead of factoring and modifying them every time they are accessed. Factoring a schema is usually fast anyway. However, in an average Plone site, the ATDocument schema is accessed 80 times per request, so caching it makes sense even if a single call is fast. With schematuning, the time spent in Schema calls for ATDocument under PythonProfiler (which slows down all of Python linearly) came down from 1.518s to 0.084s. This makes it roughly 18 times faster.
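A minimal sketch of the idea, not the archetypes.schematuning code: the schema is built once per cache key and reused on later calls. The names build_schema and cached_schema and the dict-based cache are illustrative assumptions; the package itself relies on plone.memoize.

```python
# Illustrative stand-in for schema caching. None of these names come from
# Archetypes or archetypes.schematuning.
build_calls = 0
_schema_cache = {}


def build_schema(portal_type):
    """Pretend to factor a schema and count how often the work is done."""
    global build_calls
    build_calls += 1
    return {"portal_type": portal_type, "fields": ["title", "description"]}


def cached_schema(portal_type):
    """Return the schema for portal_type, building it only on a cache miss."""
    if portal_type not in _schema_cache:
        _schema_cache[portal_type] = build_schema(portal_type)
    return _schema_cache[portal_type]


# 80 accesses within one request, as in the ATDocument figure above
for _ in range(80):
    schema = cached_schema("Document")

print(build_calls)
```

Only the first of the 80 accesses pays for building the schema; the rest are dictionary lookups.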

Risks

Third party applications which modify the schema of a content type in code run the risk of being returned an old schema in subsequent calls. A method is provided to invalidate a schema in these cases. The applications would need to be modified.
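The stale-schema risk and its remedy can be sketched as follows. This is a simplified illustration under assumed names (get_schema, invalidate_schema, a class-name key); it does not reproduce the invalidation method shipped with archetypes.schematuning.

```python
# Hypothetical names throughout; this only illustrates the stale-cache risk.
_cache = {}


class Document:
    fields = ["title", "description"]


def get_schema(cls):
    """Return a cached copy of the class's field list."""
    key = cls.__name__
    if key not in _cache:
        _cache[key] = list(cls.fields)
    return _cache[key]


def invalidate_schema(cls):
    """Drop the cached entry so the next lookup rebuilds it."""
    _cache.pop(cls.__name__, None)


first = get_schema(Document)

# Third-party code changes the schema after it has been cached
Document.fields = Document.fields + ["subtitle"]

stale = get_schema(Document)   # still the old field list
invalidate_schema(Document)
fresh = get_schema(Document)   # rebuilt, now includes "subtitle"
```

Code that modifies a schema at runtime would need a call like the invalidation step here after each modification.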

Applications making use of the _updateSchema from BaseObject will continue to work.

Participants

The proposers and Eric Steele, who pushed to bring this onto this tracker.

Progress

http://pypi.python.org/pypi/archetypes.schematuning version 1.1 has been working since January 2009 and has been downloaded over 1600 times.

Hedley did work to make it browserlayer-aware; this is in svn, see https://dev.plone.org/archetypes/browser/archetypes.schematuning/trunk

Change History

comment:1 Changed 7 years ago by jensens

  • Description modified

comment:2 Changed 7 years ago by jensens

  • Owner changed from nouri to jensens

comment:3 Changed 7 years ago by raphael

As long as schemaextender (http://pypi.python.org/pypi/archetypes.schemaextender) continues to work, I'm +1 on merging this into AT proper.

comment:4 Changed 7 years ago by hannosch

Things to watch out for: do marker interfaces influencing the schema still work?

Also using plone.memoize in Archetypes itself isn't possible, thanks to licensing. Archetypes is BSD and memoize GPL.

comment:5 Changed 7 years ago by davisagli

+1 as long as the cachekey includes the Plone site path, and as long as this feature can be disabled entirely via an environment variable
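Both conditions could look roughly like this. It is a sketch only: the environment variable name SCHEMA_CACHE_DISABLED, the tuple key, and the lookup helper are assumptions made for illustration and are not part of Archetypes or archetypes.schematuning.

```python
import os

_cache = {}


def caching_enabled(environ=None):
    """Caching is on unless the (hypothetical) SCHEMA_CACHE_DISABLED is set."""
    environ = os.environ if environ is None else environ
    return environ.get("SCHEMA_CACHE_DISABLED", "") not in ("1", "true", "yes")


def site_aware_key(site_path, class_name, portal_type):
    """Key that keeps entries from different Plone sites apart."""
    return (site_path, class_name, portal_type)


def lookup(site_path, class_name, portal_type, build, environ=None):
    """Return a cached schema, or call build() directly when caching is off."""
    if not caching_enabled(environ):
        return build()
    key = site_aware_key(site_path, class_name, portal_type)
    if key not in _cache:
        _cache[key] = build()
    return _cache[key]


# object() stands in for an expensive schema build
same_site_a = lookup("/site1", "ATDocument", "Document", object)
same_site_b = lookup("/site1", "ATDocument", "Document", object)
other_site = lookup("/site2", "ATDocument", "Document", object)
uncached = lookup("/site1", "ATDocument", "Document", object,
                  environ={"SCHEMA_CACHE_DISABLED": "1"})
```

With the site path in the key, two Plone sites in one Zope instance never share a cached schema, and setting the variable bypasses the cache entirely.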

comment:6 Changed 7 years ago by erikrose

+1. schemaextender does continue to work.

comment:7 Changed 7 years ago by erikrose

And in answer to Hanno, yes, marker interfaces are taken into account:

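from zope.interface import directlyProvidedBy

def cache_key(fun, obj):
    # Marker interfaces provided directly by the instance are part of the key
    directifaces = directlyProvidedBy(obj).flattened()
    ifaces = [iface.__identifier__ for iface in directifaces]
    return frozenset([obj.__class__.__name__, obj.portal_type] + ifaces + [id(obj.__class__)])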
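To show the effect of the cache_key above without a Zope environment, here is a stand-in that models marker interfaces as a tuple of dotted names on the instance instead of zope.interface declarations. The Page class, the markers attribute, and the IFeatured name are hypothetical.

```python
# Stand-in for the cache_key above; marker interfaces are plain strings here.
class Page:
    portal_type = "Document"

    def __init__(self, markers=()):
        self.markers = tuple(markers)


def sketch_cache_key(obj):
    """Same shape as cache_key: class name, portal_type, markers, class id."""
    return frozenset(
        [obj.__class__.__name__, obj.portal_type]
        + list(obj.markers)
        + [id(obj.__class__)]
    )


plain = Page()
also_plain = Page()
marked = Page(markers=["my.package.interfaces.IFeatured"])  # hypothetical marker

same_key = sketch_cache_key(plain) == sketch_cache_key(also_plain)
different_key = sketch_cache_key(plain) != sketch_cache_key(marked)
```

Two unmarked instances of the same class share one cache entry, while an instance carrying a marker gets its own.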

comment:8 Changed 7 years ago by alecm

We should certainly have schema caching. However, I think we need schemas cached per instance, e.g. with a _v_attribute or RAMCached by object path. This wouldn't provide as much performance advantage, but would avoid some incompatibilities.
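The per-instance variant with a _v_ attribute might look like this. It is a plain Python sketch: Content, _compute_schema, and _v_schema are illustrative names, and outside ZODB the attribute is just an ordinary attribute.

```python
class Content:
    """Plain Python stand-in for a persistent content object."""

    def __init__(self, fields):
        self.fields = list(fields)
        self.compute_count = 0

    def _compute_schema(self):
        """Stand-in for the expensive schema calculation."""
        self.compute_count += 1
        return tuple(self.fields)

    def Schema(self):
        # In ZODB, _v_ attributes are volatile: they are never written to the
        # database and are dropped when the object is ghosted.
        cached = getattr(self, "_v_schema", None)
        if cached is None:
            cached = self._v_schema = self._compute_schema()
        return cached


a = Content(["title"])
b = Content(["title", "body"])
a.Schema()
a.Schema()
b.Schema()
```

Each instance computes its schema once and nothing is shared between instances, which is why this approach saves less work than a cache keyed on class and portal_type.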

comment:9 Changed 7 years ago by hannosch

Why do we really need schema caching? In standard Archetypes in the most common scenario the schema is just a class variable, pre-calculated at startup. In that scenario any type of caching just introduces additional overhead in terms of both computation and memory usage.

Once you use schemaextender you certainly can benefit from caching, as the additional dynamic schema calculation can get quite expensive. I wonder if we shouldn't provide a couple of default schema cache variants taking some different combinations into account as part of schemaextender. It would be an explicit opt-in from the developer using the package to decide if caching is worthwhile.

In our projects using schemaextender we tend to have quite simple schema caches using just class and portal_type most of the time. Calculating all those potential marker interfaces on an instance is somewhat expensive in itself.

We also happen to have a case of using persistent per instance schemas without schemaextender. Caching in that case is just stupid, as the schema is almost never the same anyways and so there's not much that could be shared.

I think we either need to have a very simple schema cache which works in 80% of the cases and you can opt-out of it or we need different variants tuned at different situations and can opt-in to those.

comment:10 Changed 7 years ago by hedley

Sorry for being so late to this thread.

> In standard Archetypes in the most common scenario the schema is just a class variable, pre-calculated at startup.

I was not aware of that. I know schemaextender leads to repeated calls to the Schema method but I did not know plain AT schemas do not have that effect.

> Calculating all those potential marker interfaces on an instance is somewhat expensive in itself.

I have never benchmarked the actual cache key method but it is much much cheaper than a call to the Schema method.

> We also happen to have a case of using persistent per instance schemas without schemaextender. Caching in that case is just stupid, as the schema is almost never the same anyways and so there's not much that could be shared.

Agreed, but that case is the exception and not the norm.

> I think we either need to have a very simple schema cache which works in 80% of the cases and you can opt-out of it or we need different variants tuned at different situations and can opt-in to those.

That would be ideal and I'm open to suggestions. Since I have never used per-instance schemas, can you point me to a product that does so?

comment:11 Changed 7 years ago by hedley

(In [29170]) Create documentation refs #9376

comment:12 Changed 7 years ago by erikrose

Some numbers: we're instantiating a few hundred new FacultyStaffDirectory Person objects, along with Relations relationships and schemaextender use, from a script (so no template rendering or anything).

Without schematuning: 0.1-1 Persons per second
With: 5-10 per second

comment:13 Changed 7 years ago by kevin7kal

Here are some benchmarks for creating and editing schema-extended Archetypes objects with and without archetypes.schematuning installed: http://weblion.psu.edu/news/content-editing-and-creation-in-plone-is-faster

comment:14 Changed 7 years ago by alecm

(In [29484]) Review for PLIP #9376 (refs #9376)

comment:15 Changed 7 years ago by erikrose

I've had this sort of thing happen twice in the last few days on two different boxes with schematuning installed:

TypeError

('Could not adapt', <CacheTool at /agsci.psu.edu/portal_cache_settings used for /example.psu.edu/document_homepage_view>, <InterfaceClass Products.Archetypes.interfaces._schema.ISchema>) (Also, the following error occurred while attempting to render the standard error message, please see the event log for full details: ('Could not adapt', <CacheTool at /example.psu.edu/portal_cache_settings used for /example.psu.edu/default_error_message>, <InterfaceClass Products.Archetypes.interfaces._schema.ISchema>))

Traceback (innermost last):

Module ZPublisher.Publish, line 202, in publish_module_standard
Module ZPublisher.Publish, line 150, in publish
Module Zope2.App.startup, line 221, in zpublisher_exception_hook
Module ZPublisher.Publish, line 119, in publish
Module ZPublisher.mapply, line 88, in mapply
Module ZPublisher.Publish, line 42, in call_object
Module Shared.DC.Scripts.Bindings, line 313, in __call__
Module Shared.DC.Scripts.Bindings, line 350, in _bindAndExec
Module Products.CMFCore.FSPageTemplate, line 216, in _exec
Module Products.CacheSetup.patch_cmf, line 28, in FSPT_pt_render
Module Products.CacheSetup.content.cache_tool, line 271, in getEnabled
Module Products.Archetypes.BaseObject, line 241, in getField
Module Products.archetypes_schematuning.archetypes.schematuning.patch, line 49, in Schema
Module plone.memoize.volatile, line 272, in replacement
Module Products.archetypes_schematuning.archetypes.schematuning.patch, line 42, in _Schema

We should probably track that down before getting too committed.

comment:16 Changed 7 years ago by alecm

That's an odd one indeed. What were you doing when you saw that error? It's unable to look up a schema for portal_cache_settings, but it's not clear why, or why it would need to do that. It seems like the sort of thing that's not likely to have been caused directly by this patch though, unless somehow the definition of self changed during the schema lookup process.

comment:17 Changed 7 years ago by ldr

Discussion on IRC confirms that schematuning is only relevant when schemaextender is being used, so it belongs in that package rather than in core. plone.app.blob depends on schemaextender and is proposed for inclusion; this dependency should then presumably be factored out of plone.app.blob.

comment:18 Changed 7 years ago by esteele

  • Status changed from new to closed
  • Resolution set to wontfix

The Framework Team feels that the changes recently made to archetypes.schemaextender render this PLIP moot. Closing.
