We need a transformation that extracts background-color, which we understand, from background, which we remove. We should add such transformations for more CSS properties. I'm just not sure how we can create them automatically from e.g. config.colorButton_backStyle. We may need to add new settings for styles which would guide the filter.
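A minimal sketch of such a transformation, assuming styles are available as a plain object of property/value pairs. The helper name extractBackgroundColor, the short list of named colors and the rule that an explicit background-color wins over the shorthand are all assumptions made for illustration; none of this is existing CKEditor API.

```javascript
// Hypothetical helper, not part of the CKEditor API.
// Only a handful of named colors are listed; a real filter would need the full CSS set.
const NAMED_COLORS = ['black', 'white', 'red', 'green', 'blue', 'yellow', 'gray', 'silver', 'transparent'];

// Matches hex colors, rgb()/rgba()/hsl()/hsla() functions, or one of the named colors above.
const COLOR_PATTERN = new RegExp(
  '(#[0-9a-f]{3,8}\\b|(?:rgb|hsl)a?\\([^)]*\\)|\\b(?:' + NAMED_COLORS.join('|') + ')\\b)',
  'i'
);

function extractBackgroundColor(styles) {
  const result = Object.assign({}, styles);
  const shorthand = result.background;

  if (shorthand === undefined) {
    return result;
  }

  // The shorthand itself is always dropped, because the filter does not understand it.
  delete result.background;

  // Ignore url(...) parts so that a file name such as "red.png" is not read as a color.
  const withoutUrls = shorthand.replace(/url\([^)]*\)/gi, '');
  const match = COLOR_PATTERN.exec(withoutUrls);

  // Simplification: an explicit background-color is kept regardless of declaration order.
  if (match && !result['background-color']) {
    result['background-color'] = match[1].toLowerCase();
  }

  return result;
}
```

For example, passing { background: '#FF0 url(red.png) no-repeat' } yields an object that carries only background-color. Deriving the list of properties to extract from settings like config.colorButton_backStyle remains the open question.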

We’re not supposed to drastically change the way things are done right now. Most of the changes are to happen in the cleanup logic.

The following are some notes about how things (will) work.

Basic workflow:

Listens for the "paste" event with priority 3.

Sniffs the paste data for Word content.

If the data comes from Word, calls CKEDITOR.cleanWord.

Updates the paste event data with the cleaned data.
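The workflow above can be sketched roughly as follows. The function names isFromWord and onPaste, the dataValue property name and the exact sniffing pattern are assumptions for this sketch; the cleanWord implementation is passed in as a parameter so that the example runs without the editor.

```javascript
// Heuristic markers that tend to appear in HTML produced by MS Word.
// The exact pattern is an assumption made for this sketch.
const WORD_MARKERS = /(class="?Mso|style="[^"]*\bmso-|w:WordDocument)/i;

function isFromWord(html) {
  return WORD_MARKERS.test(html);
}

// `evtData` stands for the paste event data, `cleanWord` for the
// (possibly custom) cleanup implementation.
function onPaste(evtData, cleanWord) {
  if (isFromWord(evtData.dataValue)) {
    // Replace the pasted HTML with the cleaned version.
    evtData.dataValue = cleanWord(evtData.dataValue);
  }
  return evtData;
}

// In the editor this would presumably be wired up along the lines of:
// editor.on('paste', function (evt) { onPaste(evt.data, CKEDITOR.cleanWord); }, null, null, 3);
```

Passing cleanWord in as an argument also mirrors the idea of swappable implementations discussed next.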

It allows custom "cleanWord" implementations through configuration and on-demand file loading. I'm unsure about the point of this, as one could simply write a plugin that overrides the default plugin behavior. Anyway, we don't have to change this right now and it can be a v5 thing.

The default cleanWord implementation will not do the deep cleanup it does today. It’ll instead “normalize” the HTML in a way that it has proper semantics, using proper HTML features. The cleanup part will be left to ACF. This is all about this ticket, in fact.

The default cleanWord implementation is based on data processing, mainly.

What to do with the current configurations:

pasteFromWordPromptCleanup: I don’t feel that this is necessary any more… actually it may still exist but by default it may depend on ACF, if it’s enabled.

pasteFromWordRemoveFontStyles: we can keep this

pasteFromWordNumberedHeadingToList: ???

pasteFromWordRemoveStyles: I have the impression that this is not needed any more. We would keep it only if we want to let one decide to have a cleaner paste from Word, even if the editor supports style features.

So the current idea is to develop tests and write the cleanup logic from scratch, reusing what is reusable from the current implementation.

We're sad when we have to do that, but we weren't able to find resources for Paste From Word for the last couple of months. We have to release the drag and drop and file upload support, which is ~75% ready, as soon as possible, so I'm postponing PFW to the next major release.

A set of virtual machines has been deployed, containing the most crucial combinations of MS Word/IE versions. An application is available on each of the VMs that accepts *.docx files as an input and outputs the HTML that would normally be generated when the contents of a Word file are pasted into the editor. An additional command-line utility enables remote access to the applications.

The point of this exercise is to automatically generate fixtures for unit tests. A growing number of example *.docx files is now available for processing.

Next step: Creating the first batch of tests using the generated fixtures.

I made a quick review of the code that you pushed to branch:t/9991. Here's the result:

The MS Word fixture files are missing in the repo.

We need instructions on how to generate HTML fixtures from the MS Word files (not public, because it requires our infrastructure).

The branch must be rebased before putting it on review.

Tests should not be inside tests/tickets/9991. The proper place is inside the PFW plugin's dir. Note that they should be separated from the tests for the legacy filter.

Tests for the previous filter should be tagged and configured to use the previous filter (they are failing right now).

Remember about strict mode.

Don't put a blank line at the beginning of a function or block (default.js L14, L336).

Separate blocks with a blank line (applies to blocks like elements, elementNames, all element names in default.js).

setSymbol() (switch block): what if a list starts with item "3", so no b., no II. etc.? Using regexps rather than case statements will be more flexible.

L138 children[0] => children[ 0 ] (no JSCS rule for that because the existing one is broken).

L159-177 - use one nope function.

However, more important is what has to be done with the task in general:

We need to make sure that the new filter is at least as good as the old one. AFAICS the existing tests are oriented around the simple cases. We need to create tests for at least some of the fixed bugs in the old PFW filter. So the first thing that we need to do is to stabilise the new filter.

The second thing is to actually implement the spec that I wrote (comment:18). Currently the new filter works in nearly the same manner as the old one: it normalizes the wordy HTML to standard HTML, and then it also does the transformation and filtering, which it should not do. So we need to take the current filter and move some of its code to ACF's transformations and filtering. This task can be done in a separate ticket, because it will be mostly transparent for the users. It will, however, make the PFW feature more configurable (through ACF's settings).

Don't put a blank line at the beginning of a function or block (default.js L14, L336).

We need a JSCS rule for that then, spanning the whole project.

Separate blocks with a blank line (applies to blocks like elements, elementNames, all element names in default.js).

Same as above.

setSymbol() (switch block): what if a list starts with item "3", so no b., no II. etc.? Using regexps rather than case statements will be more flexible.

Simply using regexps doesn't solve this. What if a list starts with an "i."? No way of telling if it's a roman numeral or a letter. And this is only the tip of the iceberg. This function was designed for the limited scope that this iteration of PFW has. I'll correct it as soon as we have the proper test cases.
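To illustrate the ambiguity, here is a small sketch that returns every list type a marker could belong to. The function name candidateListTypes and the type labels are made up for this example, and the roman-numeral test is deliberately loose (it also accepts invalid numerals such as "iiii").

```javascript
// Hypothetical helper: lists all list types that a marker such as "i." could mean.
function candidateListTypes(marker) {
  // Drop surrounding whitespace and a trailing "." or ")".
  const symbol = marker.trim().replace(/[.)]$/, '');
  const candidates = [];

  if (/^\d+$/.test(symbol)) candidates.push('decimal');
  if (/^[ivxlcdm]+$/.test(symbol)) candidates.push('lower-roman');
  if (/^[IVXLCDM]+$/.test(symbol)) candidates.push('upper-roman');
  if (/^[a-z]$/.test(symbol)) candidates.push('lower-alpha');
  if (/^[A-Z]$/.test(symbol)) candidates.push('upper-alpha');

  return candidates;
}
```

For "3." the result has a single entry, while for "i." it has two (roman and alphabetic), so the marker alone cannot settle the list type; neighbouring items or Word's own list metadata would be needed.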

Transformations should ensure that the target transformation can be applied. Assume a situation where ACF is set up so that it allows font[face] but does not allow span{font-family}. With the current implementation the element would be converted to span (no matter what) and then... removed by ACF. You can do this with the check property (http://docs.ckeditor.com/#!/api/CKEDITOR.filter-method-addTransformations).
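A minimal sketch of the "check before transforming" idea, kept independent of the editor. The element shape ({ name, attributes, styles }), the function name fontFaceToSpan and the isAllowed predicate are assumptions for this example; in the editor the predicate would presumably be backed by ACF itself.

```javascript
// Hypothetical transformation: font[face] -> span{font-family},
// applied only if the target form is allowed by the filter.
function fontFaceToSpan(element, isAllowed) {
  if (element.name !== 'font' || !element.attributes.face) {
    return element;
  }

  // Leave the element alone if the filter would strip the result anyway.
  if (!isAllowed('span{font-family}')) {
    return element;
  }

  const attributes = Object.assign({}, element.attributes);
  const face = attributes.face;
  delete attributes.face;

  return {
    name: 'span',
    attributes: attributes,
    styles: Object.assign({}, element.styles, { 'font-family': face })
  };
}
```

With a predicate that rejects span{font-family}, the font element passes through unchanged and survives a filter that allows font[face].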

You shouldn't create a new CKEDITOR_MOCK global variable. Instead, create a CKEDITOR.plugins.pastefromword static namespace and put the methods there. However, mark members that you don't want to be used outside of PFW as "@private".

Now within this namespace, create members like CKEDITOR.plugins.pastefromword.lists, where all list-related members should go.
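A rough sketch of that layout. The CKEDITOR object below is only a stand-in so the snippet runs on its own, and the isListItem member with its mso-list check is a hypothetical example of a list-related helper, not code from the branch.

```javascript
// Stand-in for the editor's global object so the sketch runs outside the editor.
const CKEDITOR = { plugins: {} };

// Static namespace for everything PFW-related.
CKEDITOR.plugins.pastefromword = CKEDITOR.plugins.pastefromword || {};

// All list-related members live under their own sub-namespace.
CKEDITOR.plugins.pastefromword.lists = {
  /**
   * Hypothetical helper: tells whether an element looks like a Word list item,
   * judging by an "mso-list" declaration in its style attribute.
   *
   * @private
   * @param {Object} element Object with an `attributes` map.
   * @returns {Boolean}
   */
  isListItem: function (element) {
    const style = (element.attributes && element.attributes.style) || '';
    return /mso-list\s*:/i.test(style);
  }
};
```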

createLists is too long and does too many things. You should extract multiple parts:

I've been reviewing PFW again for a while now, and I found a case where pasting content causes an exception. The fix is easy, but I need to prepare proper tests to guard against regressions.

@Tade0 could you please provide some sort of instructions on generating a proper Word test?

Test Custom_list_markers

I've fixed a bug where this TC would cause an exception. With that I discovered a couple of things:

The list should be converted to an unordered list (it's identified as such in MS Word). Instead it's converted to two ordered lists, and the list gets a start value of 19. That's actually a regression compared to the current implementation.

Since we're not supporting custom markers yet, we should filter out custom markers. Those could be recognized by:

Checking if src starts with the "file://" protocol and alt is equal to "*".
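That check could look roughly like this. The function name isCustomMarkerImage and the plain attributes object are assumptions for the sketch.

```javascript
// Hypothetical helper: recognizes an image that Word pasted as a custom list marker.
function isCustomMarkerImage(attributes) {
  const src = attributes.src || '';

  // Custom markers point to a local file and carry "*" as the alternative text.
  return src.indexOf('file://') === 0 && attributes.alt === '*';
}
```

Images matching this check would then be dropped before the list is built.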