The Content Handler Plugin
After reading this guide, you will be able to:
- Understand what Content Handler are and how to use them
- Use the Content Handler API to create, modify and
- Extend Content Handler with custom implementations
This guide is currently work-in-progress.
1 What are Content Handlers?
Aloha Content Handlers (Content Handler) are used to sanitize content loaded or pasted in Aloha Editor.
There are some Content Handler available:
- Generic
- Word
- oEmbed
- Sanitize
Plugins also provide Content Handler:
- Block common/block plugin
There are four hooks available:
- insertHtml in plugin common/commands
- initEditable in core/editable.js
- getContents in core/editable.js
- smartContentChange in core/editable.js
2 Enabling the Content Handler Plugin
2.1 Using Configuration
Aloha.settings.contentHandler = { insertHtml: [ 'word', 'generic', 'oembed', 'sanitize' ], initEditable: [ 'sanitize' ], getContents: [ 'blockelement', 'sanitize', 'basic' ], sanitize: 'relaxed' // relaxed, restricted, basic, allows: { elements: [ 'strong', 'em', 'i', 'b', 'blockquote', 'br', 'cite', 'code', 'dd', 'div', 'dl', 'dt', 'em', 'i', 'li', 'ol', 'p', 'pre', 'q', 'small', 'strike', 'sub', 'sup', 'u', 'ul'], attributes: { 'a' : ['href'], 'blockquote': ['cite'], 'q' : ['cite'] }, protocols: { 'a' : {'href': ['ftp', 'http', 'https', 'mailto', '__relative__']}, // Sanitize.RELATIVE 'blockquote': {'cite': ['http', 'https', '__relative__']}, 'q' : {'cite': ['http', 'https', '__relative__']} } }, handler: { generic: { // this will disable the transformFormattings method in the generic content handler transformFormattings: false, // settings this configuration properteis will // enable conversion of copyied elements to specified elements in the clipboard // this can be used to counter Chrome's copy&paste bug on styled elements transformFormattingsMapping: [ // this will make sure a copied b element will be be pasted again as a b element // Chrome would try to change it to a span with a style attribute of 'font-weight: 700' // // Note: the attributes to check for are dependent on the style applied to the copied element. // for example to detect sup and sub elemenst it is suggested to add detectable font-size values (75% and 75.01%) { nodeNameIs: 'span', nodeNameShould: 'b', attribute: { name: 'style', value: 'font-weight: 700' } }, { nodeNameIs: 'span', nodeNameShould: 'sup', attribute: { name: 'style', value: 'font-size: 12.6px' } } ] }, sanitize: { '.aloha-captioned-image-caption': { elements: [ 'em', 'strong' ] } } } }
For the sanitize Content Handler you can use a predefined set of allowed HTML-tags and attributes (“sanitize” option) or your own set defined as “allows” option. The configuration option for “insertHtml” and “initEditable” defines which Content Handler should be used at that hook points. “initEditable” uses the “sanitize” Content Handler by default. “insertHtml” uses all registered Content Handler by default (including the Content Handler from other Plugins)
The order how the handlers are loaded and executed is important. If you first load the “sanitize” or “generic” handler the “word” handler will not be able to detect a MS Word document.
3 APIs and Extension Points
There is the Content Handler Manager in Aloha Core available where Content Handler can register themself.
3.1 Writing your own Content Handler
define( ['aloha', 'jquery', 'aloha/contenthandlermanager'], function(Aloha, jQuery, ContentHandlerManager) { "use strict"; var MyContentHandler = ContentHandlerManager.createHandler({ handleContent: function( content ) { // do something with the content return content; // return as HTML text not jQuery/DOM object } }); return MyContentHandler; });
All Content Handler needs to support the “handleContent” method and will receive and return the content as HTML text.
4 Internals
5 Future Work
6 Changelog
- October 06, 2011: Initial version by Rene Kapusta
From plugins.texile
7 Contenthandler Plugin
The Contenthandler Plugin has no user interface and is used in conjunction with the paste plugin to be able to handle pasted content from Microsoft Word and the like. Currently it provides three so called “Content Handlers” which will be used to cleanup html content on various occasions like pasting or when initializing an editable. Those contenthandlers are:
- word
- generic
- sanitize
7.1 Word Content Handler
The Word Content Handler will detect content pasted from Microsoft Word 2003 and newer versions. It will:
- remove all “mso-*” classes
- remove all divs and spans from the pasted content
- remove font tags
- convert bullet points to unordered lists
- transform titles to h1 and subtitles to h2 tags
- keep bold and italic formatting, headlines, tables and lists but get rid of all other formattings
- Remove links to local files (with file :// URLs), but it preserves the link contents
Word supports much more formatting options than Aloha Editor. Therefore the content pasted from Word into Aloha Editor will be different, if its formatting or structure in Word is not supported by Aloha Editor. Additionally, the HTML produced by Word when doing Copy & Paste is generally of poor quality (or even invalid) and different depending on the used browser version. Therefore, the resulting content can also be different across browsers and may not reflect the formatting or structure of the original content in Word.
7.2 Generic Content Handler
The Generic Content Handler is a bit less generic than his name might suggest as he will apply the following cleaning actions:
- clean lists by removing any non-ul or li children
- cleaning tables by removing the following attributes
- border
- cellpadding
- cellspacing
- width
- height
- valign
- remove comments
- unwrap contents of div, font and span tags (unless the span’s are language annotations made by the wai-lang plugin)
- remove styles
- remove namespaced elements
- transform formattings by turning
- <strong> tags into <b> tags
- <em> to <i>
- <s> and <strike> to <del>
- and by removing <u> tags (as underline text decoration should solely be used for links)
7.2.1 Use Generic Content Handler to repair Chrome copy & paste issues
The generic content handler allows can be configured to transform elements with specific attributes into other elements. This is used to counter Chrome 75’s copy paste issues in contenteditable.
Chrome will try to maintain styling on the copied range. Depending on the styling of the copied element this may lead to <b> elements turning into <span style="font-size: 700"> As of now Chrome is unable to copy and elements in contenteditable. This leads to and elements turning into spans.
7.2.1.1 Example Styling
b { font-weight: 700; } sub, sup { font-size: 75%; } sup { /* only needed in editmode - use to to tell difference by carried over style attribute */ font-size: 75.01% }
7.2.1.2 Example config for repairing Chrome copy paste issues
Aloha.settings.contentHandler.handler.generic.transformFormattingsMapping: [ { nodeNameIs: 'span', nodeNameShould: 'b', attribute: { name: 'style', value: 'font-weight: 700' } }, { nodeNameIs: 'span', nodeNameShould: 'sup', attribute: { name: 'style', value: 'font-size: 12.6017px' } }, { nodeNameIs: 'span', nodeNameShould: 'sub', attribute: { name: 'style', value: 'font-size: 12.6px' } } ];
7.3 Sanitize Content Handler
The Sanitize Content Handler does not work reliably on IE7, and will therefore just do nothing, if this browser is detected.
The Sanitize Content Handler will remove all dom elements and attributes not covered by it’s configuration. You may specify your own configuration based on these default settings:
Aloha.settings.contentHandler.sanitize = { // elements allowed in the content elements: [ 'a', 'abbr', 'b', 'blockquote', 'br', 'cite', 'code', 'dd', 'del', 'dl', 'dt', 'em', 'i', 'li', 'ol', 'p', 'pre', 'q', 'small', 'strike', 'strong', 'sub', 'sup', 'u', 'ul' ], // attributes allowed for specific elements attributes: { 'a' : ['href'], 'blockquote' : ['cite'], 'q' : ['cite'], 'abbr': ['title'] }, // protocols allowed for certain attributes of elements protocols: { 'a' : {'href': ['ftp', 'http', 'https', 'mailto', '__relative__']}, 'blockquote' : {'cite': ['http', 'https', '__relative__']}, 'q' : {'cite': ['http', 'https', '__relative__']} } }