Ticket #9186 (closed PLIP: fixed)

Opened 7 years ago

Last modified 6 years ago

Set Image IDs from Title field

Reported by: erikrose Owned by: erikrose
Priority: minor Milestone: 4.0
Component: Archetypes Version:
Keywords: Cc: plip-advisories@…, yurj

Description (last modified by erikrose) (diff)

Proposed by: Erik Rose
Seconded by: (Step right up!)

The Problem

The Image type chooses the IDs of new objects differently from other types: rather than transforming the Title ("My Dog Watches Me Eat" becoming "my-dog-watches-me-eat"), it uses the name of the uploaded file. This has several disadvantages:

  1. Inconsistency with other types. 8.5 (yes, there was a half) out of 14 of our Plone users' group attendees today find the current behavior surprising or counterproductive. Only 1 actively prefers the current behavior. The rest don't see any problem in changing it.
  2. Many common use cases involve images whose filenames are autogenerated and uninformative. All these cases result in URLs like Picture%201.PNG, which are uninformative, subjectively funny looking, and hard to remediate without first renaming and then re-uploading a large file:
    • Screenshots on a Mac come with titles like Picture 1 and, because their lifetimes on disk are typically short, aren't often renamed.
    • Pictures from cameras look like P1090404.JPG.
    • Flickr images look like 3425573738_90e84302e8.jpg.
  3. When screenshots are routinely uploaded to the same folder (say, a site-wide images folder), everybody but the first uploader of Picture 1.png is met with an error—"There is already an item named Picture 1.png in this folder."—and has to do a context switch back to another app to rename the file (which, if the user had good reason to name the local file "Picture 1", is pretty intrusive).
  4. The case of image extensions differs depending on platform. Windows tends to use capital .JPG; Mac, .jpg. Such inconsistency makes URLs hard to guess.
  5. There's no other way in Plone, short of manually enabling short-name editing, to end up with capital letters in URLs.

The Proposal

  • Have Images choose their IDs based on entered Titles, as with other types.
  • The Title field remains unrequired, thus maintaining happiness with uploads via WebDAV or FTP, which don't provide opportunity to enter a title.
  • On creation form submission, the object's ID is assigned as follows:
    • If the Title field is the same as the name of the uploaded file, the ID is the contents of the Title field, without going through the typical Title-field-normalization filter. This maintains the current behavior as an option.
    • If the Title field is different from the name of the uploaded file, the ID is the normal munged Title field contents.
  • If browsers are awful and require file extensions in URLs (which they don't), compute them from the uploaded MIME type using the existing guess_content_type() machinery or something. This solves the inconsistency where some files say .jpg, others .jpeg, and still others .JPG.
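
The assignment rules above can be sketched roughly as follows. This is only an illustration, not the PLIP's implementation: `normalize_title` is a simplified, hypothetical stand-in for Plone's real title normalizer, and `choose_image_id` is a made-up name. Treating a blank Title as "use the filename" is an assumption that follows from the Title field staying optional for WebDAV and FTP uploads.

```python
import re
import unicodedata


def normalize_title(title):
    """Hypothetical, simplified stand-in for Plone's title normalization."""
    # Decompose accented characters and drop anything that is not ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Lowercase, then collapse every run of other characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def choose_image_id(title, filename):
    """Pick the ID for a newly created Image (illustrative only)."""
    if not title or title == filename:
        # Blank Title (e.g. a WebDAV/FTP upload) or a Title identical to the
        # filename: keep the filename untouched, skipping normalization.
        return filename
    # Otherwise the ID comes from the Title, as with other content types.
    return normalize_title(title)


print(choose_image_id("My Dog Watches Me Eat", "Picture 1.png"))
print(choose_image_id("Picture 1.png", "Picture 1.png"))
print(choose_image_id("", "P1090404.JPG"))
```

Under these assumptions, only the first call yields a title-derived ID; the other two fall back to the uploaded filename, which is how the current behavior stays available as an option.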

Addendum: the File type

P.S. Though this PLIP shouldn't succeed or fail based on this, we might also consider making a similar change to the behavior of the File type. As of the first closing of this ticket, the File type has not been changed.

 http://dev.plone.org/collective/browser/Products.ATContentTypes/branches/plip9186-image-ids-from-titles

Change History

comment:1 Changed 7 years ago by erikrose

  • Description modified (diff)

comment:2 Changed 7 years ago by rossp

  • Cc rossp added

comment:3 Changed 7 years ago by erikrose

  • Description modified (diff)

comment:4 Changed 7 years ago by erikrose

  • Description modified (diff)

Added a little UI sugar when reverting to the old behavior.

comment:5 Changed 7 years ago by erikrose

  • Description modified (diff)

Let's make Title required since we have that JS sugar now.

comment:6 Changed 7 years ago by erikrose

  • Summary changed from Respect Title field when assigning IDs to Images to Set Image IDs from Title field

comment:7 follow-up: ↓ 8 Changed 7 years ago by kleist

Suggestion

    if title is provided:
        auto-generate the id from the title
    else:
        auto-generate the id from the file name

comment:8 in reply to: ↑ 7 Changed 7 years ago by erikrose

Replying to kleist:

    if title is provided:
        auto-generate the id from the title
    else:
        auto-generate the id from the file name

Yep, that's what effectively happens given the JS Title-filling-in. :-)

comment:9 Changed 7 years ago by erikrose

  • Milestone changed from Trunk to 4.0

Now that we've figured out our version numbering, I can put this in a proper milestone. :-)

comment:10 Changed 7 years ago by erikrose

  • Description modified (diff)

Noted that guess_content_type() already exists and probably provides the extension-guessing we'd need.

comment:11 follow-up: ↓ 12 Changed 7 years ago by alecm

Why bother with the JS (since file upload stuff is very browser specific)? Using the title if set, otherwise using the filename (sanitize either way) seems less brittle and equally user friendly.

Browsers generally don't care about extensions; if they did, all our /image_thumb URLs for image scales would be a major issue. Some search engines do care about extensions, though, and web servers take them into account for some things (which occasionally makes the /image_thumb stuff a pain point).

I'd suggest adding at least minimal filename sanitization for files (not as extreme as what we use for titles). I have a feeling people generally want to keep their filenames intact. However, having the auto-filled JS title/id might make this acceptable for files. We need to be careful that we don't break WebDAV expectations when doing sanitization, etc.

comment:12 in reply to: ↑ 11 Changed 7 years ago by erikrose

Replying to alecm:

Why bother with the JS (since file upload stuff is very browser specific)?

Better user feedback. The user is shown what is about to happen, rather than having to read it in a manual or just submit and see what happens.

What kind of browser-specific weirdness does file upload have? I wasn't aware.

Browsers generally don't care about extensions; if they did, all our /image_thumb URLs for image scales would be a major issue.

Good point. Hooray! That removes the biggest risk to this proposal, as MIME type guessing is heuristic at best.

I'd suggest adding at least minimal filename sanitization for files (not as extreme as what we use for titles). I have a feeling people generally want to keep their filenames intact.

I suspect you're right w.r.t. keeping filenames intact. Anyone else have an opinion?

What kind of sanitization are you proposing? Anything beyond what currently goes on (if anything)?

However, having the auto-filled JS title/id might make this acceptable for files. We need to be careful that we don't break WebDAV expectations when doing sanitization, etc.

I welcome input about WebDAV, as I know barely anything about Plone's support for it.

comment:13 Changed 7 years ago by erikrose

  • Description modified (diff)

comment:14 Changed 7 years ago by pupq

A big +1 from integrators. AND this should happen with files, too -- it would be confusing and inconsistent for this to be different just for images.

I like the JS idea--we need more immediate feedback about what would happen than upload-and-see.

comment:15 Changed 7 years ago by erikrose

  • Description modified (diff)

Took into account alecm's astute observation that browsers don't give a rip about extensions.

comment:16 Changed 7 years ago by limi

I am wary of renaming people's files. That's often where they keep their version of "metadata"; I'm sure you have seen filenames like "RFP, v3" or "March 31 meeting".

People do this less with images, I agree — but I'm not sure there's a compelling upside here that outweighs the risk of losing metadata that might be in the file name.

comment:17 Changed 7 years ago by erikrose

  • Owner nouri deleted

Clearing Owner field of 4.0 PLIPs so we can use it to mean "implementor". (Many of these owners were automatically assigned from choosing a Component that had a default owner.)

comment:18 Changed 7 years ago by smcmahon

  • Cc plip-advisories@… added; rossp removed

comment:19 Changed 7 years ago by davisagli

+1 but with a caveat: the JS-filling of the Title needs to be tested and work across all supported browsers. If it's impossible to get that to work consistently, then I think we need to keep the Title field optional. In that case we could still apply the logic of using a normalized version of the Title if it is supplied, which would be a minor improvement over always using the filename.

We should also fix issue 3 (error about existing file if you upload a file with a name that is already in use) regardless of whether this PLIP gets accepted.

comment:20 Changed 7 years ago by MatthewWilkes

FWT Vote: +1

comment:21 Changed 7 years ago by rossp

My FWT vote is -1. I'm with limi on this one. I think there's good reason for uploaded content like files and images to behave differently than other content objects since we have data that is unique to that usage, the filename. It just seems like any of the options discussed here would introduce new surprises elsewhere and I don't have enough of a sense of a win here.

comment:22 Changed 7 years ago by raphael

As long as the title isn't made required (irrespective of JS tricks), I could accept this, but please keep an eye on FTP and WebDAV usage as well. Doing bulk file and image uploads is the number one use case for FTP/WebDAV, and we certainly don't want to lose that.

I'm tempted to say +1 for trying this - but I'll need to look and test carefully should that get submitted.

comment:23 Changed 7 years ago by calvinhp

FWT Vote: -1. I'm also worried that it will break many people's processes when using FTP or WebDAV to upload images if we change the ID on them during the upload. They will potentially try to upload again and end up with multiple items, with no feedback that this happened instead of an update to an image/file that already exists.

comment:24 Changed 7 years ago by erikrose

Abstaining since this is my PLIP.

comment:25 Changed 7 years ago by erikrose

  • Description modified (diff)

Revised to dodge Title field normalization in certain cases, as discussed on FWT telecon.

comment:26 Changed 7 years ago by erikrose

  • Description modified (diff)

comment:27 Changed 7 years ago by esteele

Approved by FWT vote.

comment:28 Changed 7 years ago by esteele

  • Owner set to erikrose

comment:29 Changed 7 years ago by yurj

  • Cc yurj added

JS should be used only to be more informative; we should not rely on it.

We could provide a checkbox with

[] Use the file name instead of the title for the image URL

This leaves room for people who like to maintain file names.

What if I change the title? The id will not be changed, so we will end up with inconsistencies again. Also, if you want to use WebDAV, you will upload a file and have no way to set the title to change the id, so the problem is not solved in that case either.

comment:30 Changed 7 years ago by erikrose

  • Status changed from new to assigned

comment:31 Changed 7 years ago by erikrose

If we do the same for Files as for Images, make sure downloading a Word doc still results in a file ending in ".doc" somehow, lest it be unopenable on Windows.

comment:32 Changed 7 years ago by erikrose

(In [28496]) Added buildout config for image-ID-from-title PLIP. Refs #9186.

comment:33 Changed 7 years ago by erikrose

Having JS fill in the Title field had good intentions, but it's actually misleading; it looks like the title would be set to the filename (as it probably would be—not a good thing). Now I'm thinking about a straightforward "Use file's name as Short Name" (or some better language) checkbox.

On an only tangentially related note, I notice that, when short name editing is on, the Short Name field shows up for Images but is completely ignored. I'll try to fix that, too.

comment:34 Changed 7 years ago by alecm

I thought the consensus was Title field optional, if filled use title for the id (for images not files) otherwise use filename. No js. Am I misremembering?

comment:35 Changed 7 years ago by erikrose

That sounds pretty good to me. The only downside is that you lose the present ability to upload an image, use the filename as the ID, and, at the same time, give it an unrelated Title—you'd have to leave Title blank when uploading, then come back and edit it.

comment:36 Changed 7 years ago by erikrose

Collective changeset 94438: Refactored so I don't repeat myself anymore. Now _setATCTFileContent is parametrized by a _should_set_id_to_filename method in its subclasses, ATFile and ATImage.
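
The shape of that refactoring might look roughly like the sketch below. Only the names `_setATCTFileContent` and `_should_set_id_to_filename` come from the changeset description; the class names, method signatures, and the `_id_from_title` helper are hypothetical, and the real ATContentTypes code surely differs in detail. The File behavior shown (always keep the filename) reflects the state at this point in the ticket, before the File type was changed.

```python
class ATCTFileContentSketch:
    """Hypothetical stand-in for the shared base of ATFile and ATImage."""

    def __init__(self):
        self.id = None
        self.title = ""

    def _should_set_id_to_filename(self, filename, title):
        # Hook: each subclass decides whether the filename becomes the ID.
        raise NotImplementedError

    def _id_from_title(self, title):
        # Simplified placeholder for Plone's real title normalization.
        return "-".join(title.lower().split())

    def _setATCTFileContent(self, filename, title):
        # Shared logic lives here once; only the decision is delegated.
        self.title = title
        if self._should_set_id_to_filename(filename, title):
            self.id = filename
        else:
            self.id = self._id_from_title(title)
        return self.id


class ATFileSketch(ATCTFileContentSketch):
    def _should_set_id_to_filename(self, filename, title):
        # Assumed: Files always keep the uploaded filename as their ID.
        return True


class ATImageSketch(ATCTFileContentSketch):
    def _should_set_id_to_filename(self, filename, title):
        # Assumed: Images fall back to the filename only when the title is
        # blank or identical to the filename.
        return not title or title == filename


print(ATFileSketch()._setATCTFileContent("Picture 1.png", "My Dog"))
print(ATImageSketch()._setATCTFileContent("Picture 1.png", "My Dog"))
```

The point of the pattern is that the duplicated upload handling collapses into one method, and each type only overrides a single small predicate.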

comment:37 Changed 7 years ago by erikrose

Make sure unicode filenames work right. I'm suspicious of my test for clean_filename == title.

comment:38 Changed 7 years ago by erikrose

  • Status changed from assigned to closed
  • Resolution set to fixed

In collective changeset 94846:

  • Stopped errors about duplicate filenames even if the title is going to be used as the ID.
  • Fixed a failing test due to the uploaded file being at EOF to begin with.

comment:39 Changed 7 years ago by esteele

  • Status changed from closed to reopened
  • Resolution fixed deleted

I'll close PLIP tickets when they've been merged or rejected.

Before this is considered ready for review, we'll need you to add a set of review instructions to the plips folder in the 4.0 buildout.

Thanks.

comment:40 Changed 7 years ago by erikrose

  • Description modified (diff)

Removed reference to JS, which was a bad UI idea.

comment:41 Changed 7 years ago by erikrose

  • Description modified (diff)

comment:42 Changed 7 years ago by alecm

(In [29329]) Add review for PLIP #9186 (refs #9186)

comment:43 Changed 7 years ago by davisagli

I looked into the FTP connection issue and filed a Zope bug at https://bugs.launchpad.net/zope2/+bug/418454 (it's an incompatibility between Zope's FTP server and a change in the asyncore module in Python 2.6).

comment:44 Changed 7 years ago by erikrose

(In [29558]) Put CMFTestCase back in the buildout; apparently, it was necessary after all. (Why didn't the tests fail after I ran buildout without it?) Refs #9186.

comment:45 Changed 7 years ago by esteele

Your PLIP has been reviewed by the Framework team. Feel free to discuss any suggested changes either here in the PLIP ticket or on the mailing lists. Final deadline for this PLIP is set for September 23.

comment:46 Changed 6 years ago by aclark

(In [30086]) Updated review: +1 for inclusion. Would be nice to get the 'File' type to behave this way too, before Plone 4 final. refs #9186

comment:47 Changed 6 years ago by MatthewWilkes

FWT vote: +1

comment:48 Changed 6 years ago by erikrose

(In [30229]) Updated to describe the improved implementation. Refs #9186.

comment:49 Changed 6 years ago by rossp

FWT vote: +1 for merge

comment:50 Changed 6 years ago by esteele

This PLIP has been accepted for merging into Plone 4.0

The final vote was:

  • Alec Mitchell: +1
  • David Glick: +1
  • Erik Rose: -
  • Laurence Rowe: +1
  • Matthew Wilkes: +1
  • Ross Patterson: +1

Please merge your branches into the Plone 4.0 head by end-of-day Friday Oct 16. If you need assistance with merging, please contact me.

We'll be assigning a documentation ticket to this PLIP shortly. Please assist the docs team in documenting the changes and new features that this PLIP introduces.

comment:51 Changed 6 years ago by esteele

Please assist the doc team in creating/updating documentation relating to this PLIP. See #9601

comment:52 Changed 6 years ago by esteele

  • Status changed from reopened to closed
  • Resolution set to fixed

comment:53 Changed 6 years ago by erikrose

After further discussion with esteele and polls taken at the Plone Conference, we've decided to implement ID-from-title behavior on the File type as well.

comment:54 Changed 6 years ago by erikrose

Both are done. Thanks, davisagli, for porting this to work with plone.app.blob while I wasn't looking!
