Has CityDesk started doing what the earlier version of FrontPage used to do?

Karl Max
Friday, August 29, 2003

Steven seems to think the CityDesk designers have gone out of their way to get the 'HTML fixer' to bang all HTML code into shape and force it into a consistent style.

I think the problem is more complex. It's not easy to give someone both an HTML view of their code and a WYSIWYG editor that allows them to edit the code visually while preserving their individual HTML style. How does the program know if you prefer uppercase or lowercase <p> tags? How does it know if you like to close paragraphs with </p> tags or not? And these are just simple examples - the examples in the article above are even more complex.

The CityDesk guys have done well just to get to the point where they are now. It would take a lot more time and effort to get the article editor to behave the way Steven wants, and he even said in the same article that engineering is always a game of juggling tradeoffs.

I'm sure CityDesk (and the article editor) will get better with each release. But I'd rather get small improvements every 6 months or so than one big improvement in 18 months' time. If they wait that long, someone else will come along and eat their lunch (to paraphrase Joel).

Darren Collins
Friday, August 29, 2003

I think Darren is missing the point.
I think Steven Den Beste is overreacting.
But that's just what I think, think nothing of it.

The question is (I think): Why does CD have an HTML view in the article editor? To give you the ability to tweak your content exactly to your liking, since a WYSIWYG editor is easy to use but also hides some (or a lot) of the things you can do with HTML.
So if I use the HTML view to do my "questionable" HTML tweaking, and switching to "Normal" view and back to HTML view makes CD "correct" my tweaks, what then is the point of having the HTML view in the first place?

Any thoughts?

Geert-Jan Thomas
Friday, August 29, 2003

I did not test ampersand-escape character encodings in CityDesk 1, but I've been using the <small> tag around multiple paragraphs as long as I've been using the tool (since December of 2000). CityDesk 1.0 had no trouble doing WYSIWYG with my code that way.

The WYSIWYG mode uses the standard IE HTML renderer to produce the screen text. It's available as a DLL. In fact, the actual program "Internet Explorer" is pretty small; it's just a container program that uses a lot of huge and powerful DLLs, most of which are available for other programs to use, too. FrontPage also uses that renderer, and all the mail programs which know how to display HTML-encoded email (e.g. Eudora and Outlook) use it too. And so does CityDesk.

That renderer has no trouble displaying multiple paragraphs enclosed within a single <small></small> pair. I know that because I've been doing it on my site for a year and a half and viewing the result with IE, and it looks fine. (It also looks fine in Mozilla and in Opera.)

IE also correctly handles the HTML 2.0 ampersand-encodings for special characters. (I just tried several of them out by creating a file with Wordpad and displaying it in IE6, and they all look fine. Those also look fine in Mozilla and Opera.)

CityDesk's WYSIWYG mode should also be able to handle characters encoded that way, because it already does for the "<" (&lt;) and ">" (&gt;) characters when they appear in text. If those were translated into the literal characters the way the others are now, they'd be treated as tag delimiters by the browser instead of being displayed literally. The only way they can appear in the WYSIWYG mode (or be displayed when the page is viewed) is if they are encoded as ampersand-escape strings, which indeed they are. In fact, when you type a "<" into the WYSIWYG mode, it converts it into &lt; in the HTML code it generates.
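A minimal Python sketch of the round trip described above, using the standard library's `html` module rather than anything CityDesk-specific. The function names `to_source` and `to_display` are made up for illustration.

```python
import html

def to_source(text: str) -> str:
    """Escape characters that would otherwise be read as markup."""
    return html.escape(text, quote=False)

def to_display(source: str) -> str:
    """Decode character references back into literal characters."""
    return html.unescape(source)

typed = "if a < b && b > c"
stored = to_source(typed)
print(stored)                        # if a &lt; b &amp;&amp; b &gt; c
print(to_display(stored) == typed)   # True
print(to_display("caf&eacute;"))     # named references decode the same way
```

The point is that "<" and "&eacute;" go through the same decoding step for display, so handling one while rewriting the other is a choice and not a technical necessity.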

Sorry, I don't accept the idea that this was actually needed in order to make WYSIWYG mode work. I'll take a lot of convincing that it was actually required for any reason at all.

Steven Den Beste
Friday, August 29, 2003

I think Steven's main pain is coming from the fact that CityDesk generates valid xhtml now if you use the 'normal' view. Things like this:

<small>
<p>text of first paragraph</p>
<p>text of second paragraph</p>
</small>

aren't actually valid html (block-level elements such as 'p' aren't allowed inside inline elements such as 'small') so the 'normal' view changes them. I know that most people couldn't care less about validating their site, but at least now CityDesk is being consistent and will always generate validatable xhtml so you know where you stand. In v1 it was anybody's guess what would be produced!
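The nesting rule in question (block-level elements may not sit inside inline ones) can be checked mechanically. The sketch below uses Python's `html.parser`; the `INLINE` and `BLOCK` sets are a small illustrative subset, not the full lists from the DTD, and void elements such as `<br>` are not treated specially.

```python
from html.parser import HTMLParser

INLINE = {"small", "b", "i", "em", "strong", "font", "a", "span"}
BLOCK = {"p", "div", "ul", "ol", "table", "blockquote", "h1", "h2", "h3"}

class NestingChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of currently open elements
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK:
            for outer in self.open_tags:
                if outer in INLINE:
                    self.problems.append(f"<{tag}> inside <{outer}>")
                    break   # report each block element once
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Pop back to the most recent matching open tag.
            while self.open_tags.pop() != tag:
                pass

def find_bad_nesting(source):
    checker = NestingChecker()
    checker.feed(source)
    checker.close()
    return checker.problems

print(find_bad_nesting("<small><p>one</p><p>two</p></small>"))
print(find_bad_nesting("<p><small>one</small></p>"))
```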

You can please all of the people some of the time.... and so on.

And I guess HTML view is just a throw-back to v1 and since using v2 I've stopped using it anyway.

John C
Friday, August 29, 2003

The difference may be that if you rely upon the control's DOM to create the HTML, or translate it, you'll end up with browser-specific behaviour, even if it's only nuanced behaviour.

I don't know, because I haven't looked, but if the HTML is created by CityDesk rather than the DOM you'll likely get these issues.

Since I'm knee deep in a similar swamp at the moment I may just be translating everything into my domain though :-).

I disagree with your view that the site not working in some browsers should be pushed back to the user. That seems to be precisely the sort of problem a tool like CityDesk should solve. To push that back to the user seems exactly what you're complaining about: that the tool doesn't allow the user to do their job, and instead requires them to care about things that have no relation to what they're trying to do.

Ditto for moving the tags, and closing the image tag. By making the markup valid XHTML, it's more likely the site will continue to work in the future. You trade off a small amount of size for that. In return for that small amount of size, the tool removes a problem for you to worry about.

If you want to worry about the details, why use a WYSIWYG tool like CityDesk? Why not just edit your pages with a text editor?

Sum Dum Gai
Friday, August 29, 2003

Yes the tool should let people concentrate on their goal instead of the means, and CityDesk does the right thing by making sure it produces valid HTML.
But a tool should also not touch efforts made explicitly by its users. This means not reformatting manually formatted or changed HTML. And certainly not without asking.

If I understand the issues raised here correctly, CityDesk should:
1. Produce valid HTML from the WYSIWYG editor.
2. Leave changes made in the HTML alone, but:
2a. Issue a warning if these changes violate rules that CityDesk is aware of.
2b. Offer a means to correct these violations in agreement with the user.

This should make everyone happy and retains ease of use for the novice, while giving more power to the expert.
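As a rough illustration of the "warn, but leave it alone" step (2a), here is a Python sketch that reports problems without touching the text. It only tests XML well-formedness with the standard `xml.etree.ElementTree` parser, which is much weaker than validating against the XHTML DTD, and it would also reject named entities such as `&nbsp;` that a bare XML parser does not know. `check_fragment` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

def check_fragment(source):
    """Return a list of warnings; the source itself is never modified."""
    try:
        # Wrap in a dummy root so several top-level elements are allowed.
        ET.fromstring(f"<root>{source}</root>")
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    return []

for fragment in ("<p>a</p>", "<p>a</a>"):
    warnings = check_fragment(fragment)
    print(fragment, "->", warnings or "ok")
```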

Practical Geezer
Friday, August 29, 2003

I agree with the geezer.
If you allow the user access to the formatting code, then you DON'T FUCK WITH IT. Basically, the first time the user switches to HTML view and makes a change, then the app should change from "autoformat" to "error check" mode. (This setting should be visible and easily accessible)

Philo

Philo
Friday, August 29, 2003

Geezer/Philo:

Suppose the user changes
<p>a</p>
to:

<p>a</a>

Not valid XHTML, but in your algorithm CityDesk should "warn and preserve." Fine.

Then you go into WYSIWYG mode. We parse this into a DOM. The </a> is forgiven by the parser and the unclosed <p> is forgiven by the parser. We get a document tree - a data structure - containing one paragraph with the letter a in it.

Now you change "a" to "b" and save changes. We have to write HTML to disk. What we have in the document tree is <p>b</p>. How do we reproduce the broken HTML you input to us in the first place? The parser ignored the </a> so it's long gone. I can't put it back. The parser didn't care whether you closed all your <p>'s, so I have to make a decision about whether to close <p>s or not.

The trouble is that WYSIWYG editors work on in-memory trees which are abstract representations of your original HTML. They do NOT work on HTML directly: that would be too painful. So you always have to regenerate some kind of HTML from the in-memory abstract tree. This HTML always differs a bit from what you put in. It did in CD 1 and it does in CD 2. The difference is that for standards compliance the HTML we output is now xhtml 1.0 transitional instead of html 4.01 as it was in CD 1.
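The parse-then-regenerate cycle can be shown in a few lines of Python. This is only a toy built on the standard `html.parser` module, not the IE-based machinery CityDesk actually uses; the `Node`, `ForgivingTreeBuilder`, `parse` and `serialize` names are invented here, and attributes and character escaping are ignored to keep it short.

```python
from html.parser import HTMLParser

class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []   # Node instances or plain strings

class ForgivingTreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Look for a matching open element; a stray end tag is simply dropped.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.children.append(data)

def parse(source):
    builder = ForgivingTreeBuilder()
    builder.feed(source)
    builder.close()
    return builder.root

def serialize(node):
    parts = []
    for child in node.children:
        if isinstance(child, str):
            parts.append(child)
        else:
            # The tree has no record of how (or whether) the tag was closed.
            parts.append(f"<{child.tag}>{serialize(child)}</{child.tag}>")
    return "".join(parts)

tree = parse("<p>a</a>")
tree.children[0].children[0] = "b"   # the WYSIWYG edit: "a" becomes "b"
print(serialize(tree))               # <p>b</p>
```

Nothing in the tree remembers the stray `</a>` or the missing `</p>`, so the serializer has no way to reproduce them.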

Joel Spolsky
Friday, August 29, 2003

It drives me crazy that the User Model in CityDesk is that CityDesk is "rewriting their code" or "reformatting their code," when the program model is that CityDesk is simply parsing their code and regenerating source from the parsed tree. This difference between user model and program model makes usability problems, and people accuse me personally of being some kind of strange control freak with a need to impose my own opinions of good HTML on their perfectly acceptable code. As if I'm going out of my way to "rewrite their code."

Writing a WYSIWYG HTML editor which actually preserves arbitrary syntax errors in HTML is a monumental task. The only one I've seen that does anything like a decent job of it is Dreamweaver, and they control every aspect of the editing process. We're somewhat limited because we choose to use IE to control most of the editing process: both because it's far more WYSIWYG than you would get with Dreamweaver, and because it saved us zillions of developer years. Dreamweaver lives or dies based on the quality of the HTML editor. We're an automation/content management/templating system with a decent HTML editor that's designed for word processor type users, not people who know or care about HTML. Most of our users are happy to trust the HTML we write for them and rarely look under the covers, and that's our target audience.

Joel Spolsky
Friday, August 29, 2003

A software developer is always playing balancing games. You have to balance what the user wants against the cost of implementing it. When you have a user base, this gets even harder, because you get a disparate collection of wants, some of which will even be contradictory. And you have to keep balancing the wish/bug list against resources.

The danger, methinks, is getting entrenched. You start with a few early comments about a feature and you do the "understood, but we can't because [x]" Then a few more complaints. And a few more. Soon there's a low roar of "I don't like your product because of [x]" and you're just tuning it out because you've established your beachhead. Had you been pummeled from release day with the roar, you would've changed tack immediately; but since the water only went up a degree at a time, you're dug in and convinced you're right.

I'll be interested to see if CityDesk's reformatting becomes a serious issue. To a large degree it depends on the level of sophistication of their average user - if 90% of their users never even use HTML mode, it's a nonissue. And I think this is the wrong forum to pronounce judgement. [grin]

Best wishes on the new release, Joel!

Philo

Philo
Friday, August 29, 2003

Philo, you're right about beachheads, that's why we went to xhtml in the first place.

Also if you examine the difference between 1.0 and 2.0 you'll discover that we've gone to considerable lengths to preserve whitespace formatting which we didn't use to preserve at all. CD 1.0 always reformatted paragraphs to put them all on one line and removed any unnecessary whitespace. We now actually go to extreme lengths to replace whitespace with private tags before editing them and then replace these private tags with the original whitespace after editing, which preserves a lot of formatting and indenting you may have done.
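The placeholder trick can be sketched in Python as follows. The `<x-ws>` tag name and both function names are hypothetical; the thread does not say what private tags CityDesk really uses, and a real implementation would have to make sure the editing component carries the placeholders through untouched.

```python
import re

# Hypothetical private tag; runs of whitespace containing a newline are stashed.
_WS_RUN = re.compile(r"\s*\n\s*")
_WS_TAG = re.compile(r'<x-ws n="(\d+)"></x-ws>')

def protect_whitespace(source):
    """Swap each whitespace run for a numbered placeholder tag."""
    saved = []
    def stash(match):
        saved.append(match.group(0))
        return f'<x-ws n="{len(saved) - 1}"></x-ws>'
    return _WS_RUN.sub(stash, source), saved

def restore_whitespace(edited, saved):
    """Put the original whitespace back where the placeholders survive."""
    return _WS_TAG.sub(lambda m: saved[int(m.group(1))], edited)

original = "<p>\n    first\n</p>\n\n<p>second</p>"
protected, saved = protect_whitespace(original)
edited = protected.replace("first", "changed")   # stand-in for a WYSIWYG edit
restored = restore_whitespace(edited, saved)
print(restored == original.replace("first", "changed"))   # True
```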

We could do even more to preserve whatever wacky things you may have done to your HTML. We could remember the way you capitalized every tag everywhere and reproduce it on the way out (like Dreamweaver). But that's just another feature, and, like any other feature, has to compete against other features in the bang-for-the-buck battle for limited resources. In user forums and such you always get arguments about "wouldn't X be nice" but in development labs the arguments are always "which is more important, X or Y?"

Joel Spolsky
Friday, August 29, 2003

the efforts to preserve whitespace might be a way out. If the user-edited html parts that were done in the code view were replaced or marked with special tags so that they become read-only in the wysiwyg view, then people who get off hand-editing html will at least understand that they can't have both worlds?

naw, lots of work :'(

i like i
Friday, August 29, 2003

All tools which allow you to have a "code" view and a "design" view whether they're web page editors or windows forms editors have some trade offs. Over time different tools have taken different tracks to handle this. Block a user from editing a certain area, warn users about editing a certain area, hide certain designer stuff in another file etc.

If you look at the various forms designers for Visual Studio .NET, VB6, Delphi etc there is always a chance that you will lose changes when switching between the 2 views. The reason for this is that it's incredibly hard to keep these 2 views in sync.

The problem is a user doesn't care if something is incredibly hard. They just say "Hey you overwrote what I did!" and say the tool sucks.

chris
Friday, August 29, 2003

I think the issue here, Joel, is one of those 'your internals are showing' kind of things. When I deliberately go in to edit the HTML, and then you don't preserve the HTML that I put in, there is a serious UI problem. What you are saying is that the issue isn't that you are deliberately re-formatting the HTML, but that you turn the HTML into your internal data structure and then back again. The problem is that you've done data conversions when the user has no reason to suspect such a thing is going to happen. You've violated the principle of least surprise - and anytime you surprise a user, you've got a problem.

Other than finding some way to represent the user's screwy HTML in your internal data structure, I think you only have one good way out: If the user edited the HTML, keep the user's HTML until he edits the thing in the WYSIWYG editor. Because right now, you're fibbing to the user. You showed him HTML, you let him edit it, and then you generated something else out the back end. The minute he edits the article in the WYSIWYG pane he should no longer be surprised that you have re-generated the HTML, but until then the user has a right to have his weirdness left alone.

I know, it's hard. Probably very difficult. But unless you can find a way to keep all the user's weirdness intact in your parse tree (and I doubt that's worth the trouble) you are ALWAYS going to have trouble with this. It's going to be a thorn in the side of some of your users for years and years. (Probably not for me though - I just use the WYSIWYG pane).
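One way to picture this proposal is a document object that stores the user's text verbatim and only regenerates it when a visual edit actually happens. The Python below is a sketch under that assumption; `Article` and `toy_regenerate` are invented names, and the regenerate step is a crude stand-in for a real parse and re-serialize.

```python
import re
from dataclasses import dataclass
from typing import Callable

def toy_regenerate(source: str) -> str:
    """Stand-in for parse + re-serialize: just lowercases tag names."""
    return re.sub(r"</?[A-Za-z]+", lambda m: m.group(0).lower(), source)

@dataclass
class Article:
    source: str                                   # exactly what the user typed
    regenerate: Callable[[str], str] = toy_regenerate

    def edit_in_html_view(self, new_source: str) -> None:
        self.source = new_source                  # stored verbatim

    def edit_in_wysiwyg(self, apply_edit: Callable[[str], str]) -> None:
        # A visual edit goes through the tree, so regeneration is expected here.
        self.source = apply_edit(self.regenerate(self.source))

    def switch_view(self) -> str:
        # Merely switching views hands back the stored text untouched.
        return self.source

article = Article("<P>Hello</P>")
print(article.switch_view())                      # <P>Hello</P>
article.edit_in_wysiwyg(lambda s: s.replace("Hello", "Goodbye"))
print(article.switch_view())                      # <p>Goodbye</p>
```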

The alternative: Eliminate the HTML pane. It's not what the users think it is anyway.

Michael Kohne
Friday, August 29, 2003

"The problem is a user doesn't care if something is incredibly hard. They just say "Hey you overwrote what I did!" and say the tool sucks."

Yet by the same token developers often take the lazy route out, or become inexplicably attached to a specific implementation, and justify everything their app doesn't do as "the alternatives are too hard...no one can do something crazy like that!". Of course then a competitor comes out, doing the alternative, and their cake is eaten. I'm making no comment about CityDesk--never used it--but I've long become wary of claims that alternatives aren't possible when the reality is often "the way we decided to implement things, not really taking into account problems like this, we don't want to change...so therefore it's impossible". The user quite simply doesn't care if you're using a memory tree DOM structure to hold the parsed data--they just want to do something.

Dennis Forbes
Friday, August 29, 2003

Joel,

Did I buy CD for XHTML compliance? Like I care!

It's the scripting, it's the fields attached to the articles, it's not having any server side processing.

The html view & normal view are courtesies in your program. I can't do any real editing, especially table creation and maintenance.

I'm in FrontPage all the time. Shitty editing in CD is my tradeoff for exceptional scripting/fields/non ss processing.

So stay with your strengths. Spend your time on building a feature rich language.

Bottom line is: if CD2 were not trying to be XHTML-compliant, about the same people that are beating it right now for messing with their input would complain about its lack of standards compliance next year.

Johnny Bravo
Friday, August 29, 2003

I'm curious - we've got some hot cutting edge developers here - who is concerned about XHTML compliance and why? Because it's not even on my radar... (but then, I'm still in EDIland, circa 1963)

I'm over the moon about all the work that's gone into the XHTML feature in CityDesk (sic) 2.0 but then I don't care about older browsers. Being able to generate standards-compliant markup without having to think about it is a big win in my eyes.

John Topley (www.johntopley.com)
Friday, August 29, 2003

Uh, WHY is the standard important? Standards are good when they serve a purpose, but following the standard because it's the standard is pretty silly.

XML - gives us structured text data storage
HTML - gives us a consistent method of presenting information across disparate applications
CSS - gives us a way to separate presentation from the data (to some degree)

You get the idea.

XHTML gives us...?

Philo

Philo
Friday, August 29, 2003

XHTML+CSS gives you adequate presentation of structured data which can easily downgrade for older browsers, while being still solid in the next 5 years.

Furthermore, although HTML is derived from SGML, it's not well-formed, so this is fixed in XHTML. XHTML also cleans up the separation between meaning and layout. No more [font]-tags and paraphernalia. You just structure your documents logically, and apply representational stylesheets to generate the actual layout.

Johnny Bravo
Friday, August 29, 2003

While I strive to make things xhtml compliant where it's an option and at little cost, Philo is right on the money: xhtml really _doesn't_matter_ in the vast majority of cases (very few sites serve XML that they transform into XHTML, or use XML tools to build HTML) and buys one very little other than a sense of puritanical righteousness. HTML _is_ a standard, so saying that XHTML makes it standards compliant is ridiculous - It's already a heavily followed standard.

Actually this points out something that Joel has mentioned: XHTML is good from a "web editor" developer perspective because it lets them use XML components and tools to hold and realize the node tree...but we all know that making choices that are best for the developer isn't always a good market choice.

Dennis Forbes
Friday, August 29, 2003

I cannot agree. If CD2 or any other comparable product were to deliver page code with content like this:

<html><p>One paragraph.<p>And another one.</body>

which just happens to render in most browsers, then it's still worthless to the market since the delivered code is broken, wrong, invalid. As a customer you'd rely on the grace of a few browser vendors for support of such broken markup. I doubt Mozilla would render anything from the above example.

From a developer's view, writing tools to support HTML 4.x is a major pain in the eyes due to heavy ambiguities within the "standard" itself. So XHTML is also cheaper because it's already supported by a vast amount of tools out there.

b) Seems like 'good HTML' should be a user-level option, on by default. As someone here mentions, if a user types something in the WYSIWYG editor, then it's ok to mess with their source code (hopefully in a minimal possible way, but that can always be improved in later versions. actually you should be able to start with something decent if the HTML is really tree-structured.) But if they just view it, try not to mess with it.

Then again, I use Visual Studio to reformat XML by switching into its 'visual' editor. This can mess up files, because it doesn't pay attention to the xml:space="preserve" attribute.

mb
Friday, August 29, 2003

XHTML is a first step towards the Semantic Web because it loses all the legacy presentation-oriented crap (such as the font tag) that has come to plague HTML.

For more information on the promise of the Semantic Web, read "Weaving the Web" by Tim Berners-Lee or see http://www.w3.org/2001/sw/

John Topley (www.johntopley.com)
Friday, August 29, 2003

btw, the article i referenced above is about more than xhtml.

and from what i've seen the xhtml2 people are full of it. or themselves--they definitely are of the 'impose my rules on you' model.

mb
Friday, August 29, 2003

The W3C HTML validator has bugs. Why? Because so much of the spec is in English.

XHTML is XML. It can be validated against a schema or DTD. This is empirical. There is no room for interpretation. (though there can still be bugs in the schema or DTD, of course).

XHTML can be processed by XML processors. This is perfect for content management systems (like CityDesk for instance). It saves code because you can use pre-existing libraries. Less code = fewer bugs. Therefore, CityDesk 2.0 probably has fewer bugs because of the small sacrifice of using XHTML instead of HTML.
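To make the "pre-existing libraries" point concrete: a well-formed XHTML page can be read by any stock XML parser. The Python sketch below uses the standard `xml.etree.ElementTree` module to pull the link targets out of a small page; the page content and the `list_links` helper are made up for the example.

```python
import xml.etree.ElementTree as ET

# XHTML elements live in this namespace.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

page = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>News</title></head>
  <body>
    <p>See <a href="a.html">first</a> and <a href="b.html">second</a>.</p>
    <img src="logo.png" alt="logo" />
  </body>
</html>"""

def list_links(xhtml):
    """Return the href of every <a> element, in document order."""
    root = ET.fromstring(xhtml)
    return [a.get("href") for a in root.iterfind(".//x:a", XHTML_NS)]

print(list_links(page))   # ['a.html', 'b.html']
```

The same call on tag soup such as `<p>a</a>` raises `ET.ParseError`, which is the flip side of the bargain: the generic tools only work if the markup is well-formed.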

Richard Ponton
Friday, August 29, 2003

This is the "users voting on features" thread all over again.

Except that as we are all developers or managers of developers or... we all think our votes are extra important.

My vote: You get flamed no matter what you do. Might as well get flamed for following the standards, at least then you are moving forward.

Robert Moir
Friday, August 29, 2003

"XHTML is a first step towards the Semantic Web because it loses all the legacy presentation-oriented crap (such as the font tag) that has come to plague HTML."

One could still just as easily add a style="" attribute to every tag and achieve the same tight coupling of style and content.

Again, XHTML's benefit is that it makes HTML XML conformant (which means nothing to the end user, but as mentioned it makes writing HTML editing tools easier...hurrah for the programmers!)...we had CSS and other decoupling for years before XHTML hit the scene.

Dennis Forbes
Friday, August 29, 2003

Actually making pages XML compliant *does* make a difference to end users. Because the HTML "standard" is ambiguous and browsers each support their own share of "broken" HTML, having XML compliant web pages makes it more likely that the pages will display correctly in any XHTML compliant browser. By producing compliant XHTML, CityDesk can make much stronger statements about the code being displayed correctly in a variety of web browsers.

Mike McNertney
Friday, August 29, 2003

Hmm, some points...

The comment that "the W3C HTML Validator is buggy" is misleading. Pre-XHTML specs (HTML 2.0, 3.2, 4.0...) are all associated with a DTD, and the validator performs SGML validation against it. Now, a close reading of the SGML specification suggests that validation encompasses both the technical (machine-validatable) part of the DTD and the accompanying prose, hence the ambiguity. On the other hand, those constraints which are imposed in the prose of HTML can't bind XML at all, so I'm skeptical as to how much is really gained by switching to XML validation. (What switching to XML does do is make it easier to enforce well-formedness, which produces considerably cleaner code; this can also be done with SGML tools, but it's probably easier just to use XML.)

Joel's comments about broken tags being rearranged/dropped by the parser are spot on. More to the point, this is the *exact same process* that occurs in the HTML parser of your browser. See http://ln.hixie.ch/?start=1037910467&count=1 for an account of this process in different browsers; for instance, if I remember the details of residual style handling correctly, Mozilla will probably transform the markup

<small>
<p>a</p>
<p>b</p>
</small>

into

<p><small>a</small></p>
<p><small>b</small></p>

as can be discovered by using, for instance, Mozilla DOM Inspector to look at the DOM in the tree view.

In other words, the CityDesk parser is doing more or less the same thing that the "fix-up" part of your browser's parser does when it reads the page. If your markup is getting rewritten by the CityDesk parser, it is almost certainly not being interpreted by browsers as you wrote it. Given that the "fix-up" part of the parser is basically a conglomerate of ill-understood hacks that's evolved over time to accommodate the most common forms of incorrect markup (at least in Mozilla; I have little confidence that it's better in any other browser), trusting in it to reliably correct your markup in perpetuity strikes me as a dangerous exercise. You're better off accepting the corrections and adjusting your HTML coding habits to write markup that's straightforwardly understood by the browser without rearrangement.
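The <small>/<p> rearrangement shown above can be imitated with a short tree rewrite. This Python sketch is a toy, not Mozilla's residual style algorithm and not CityDesk's code: the `BLOCK` and `INLINE` sets are an illustrative subset, only inline elements whose children are all block elements are handled, and any text sitting directly inside the inline element before its first block is ignored.

```python
import xml.etree.ElementTree as ET

BLOCK = {"p", "div", "ul", "ol", "table", "blockquote"}
INLINE = {"small", "b", "i", "em", "strong", "font"}

def fix_up(parent):
    """Rewrite <inline><block>x</block></inline> as <block><inline>x</inline></block>."""
    new_children = []
    for child in list(parent):
        fix_up(child)   # repair nested cases first
        wraps_blocks = (
            child.tag in INLINE
            and len(child) > 0
            and all(grand.tag in BLOCK for grand in child)
        )
        if not wraps_blocks:
            new_children.append(child)
            continue
        for block in list(child):
            # Move the block's content into a fresh copy of the inline element.
            wrapper = ET.Element(child.tag, dict(child.attrib))
            wrapper.text = block.text
            for inner in list(block):
                block.remove(inner)
                wrapper.append(inner)
            block.text = None
            block.append(wrapper)
            new_children.append(block)
        if child.tail:
            # Keep any text that followed the inline element.
            new_children[-1].tail = (new_children[-1].tail or "") + child.tail
    for child in list(parent):
        parent.remove(child)
    parent.extend(new_children)

root = ET.fromstring("<body><small><p>a</p><p>b</p></small></body>")
fix_up(root)
print(ET.tostring(root, encoding="unicode"))
# <body><p><small>a</small></p><p><small>b</small></p></body>
```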

Chris Hoess
Friday, August 29, 2003

"If the user edited the HTML, keep the user's HTML until he edits the thing in the WYSIWYG editor. Because right now, you're fibbing to the user. You showed him HTML, you let him edit it, and then you generated something else out the back end. The minute he edits the article in the WYSIWYG pane he should no longer be surprised that you have re-generated the HTML, but until then the user has a right to have his wierdness left alone."

I agree.

"Most of our users are happy to trust the HTML we write for them and rarely look under the covers, and that's our target audience."

Oh, I'm a marginal user because I spend LOTS of time editing html directly. I thought the html view was for people like me, who I thought was your target audience. I presume that someone who is afraid of html won't be using your scripting either.

Why didn't you spend time on a true enterprise edition?

Why didn't you spend time on a built-in breadcrumb variable?

Why didn't you spend time on a function that lets us evaluate folders in (thisFolder) and subdirs of (thisFolder)?

Why didn't you spend time on creating true if..then..else statements beyond IF NONBLANK?

If a user really cares about XHTML, they're using another tool in conjunction with CD, just like I use FrontPage in conjunction with CD.

I don't think I'm a marginal user because I code in html directly. You shouldn't think so either.

Joel, be responsive to us as you've shown to be in the past. Steven Den Beste is your first case study in someone who won't take your free upgrade, and probably won't buy your program again. You may be technically right, but that don't pay your overhead. Let the user decide what "compliance" they want. I want my html left alone, just like you did in the early v2 betas.

I'd suggest that this would be because you'd still be waiting for version 1, rather than discussing version 2, if he had.

Sum Dum Gai
Saturday, August 30, 2003

I meant...

Valuable time was spent to build a feature that is outside the strength and core of the program.

Bob Bloom
Saturday, August 30, 2003

I think Joel made the right decision. You can't span paragraphs with semantic/formatting tags and you never have been able to. It's always been an error to do so. So if you try, Joel's "FogCreek Exclusive Auto-Correctifier, Patent Pending" brilliantly fixes your errors automatically. So some people who don't understand the standard and/or don't value web pages that render in all browsers are unhappy. Who cares. Let them use vi. The last thing FC needs is for people to be visiting web sites that sport the "Made with CityDesk" glyph which don't render properly in Mozilla or Opera because they are full of erroneously formatted code.

Tony Chang
Monday, September 1, 2003

"I meant...

Valuable time was spent to build a feature that is outside the strength and core of the program."

The point is, by sticking to XHTML (which is XML), CD can make a lot more assumptions about the code. It will be well-formed. Every tag has an end tag (effectively). Opening and closing tags will not overlap.

These assumptions make it *easier* to write CD. They reduce complexity. Valuable time was *saved* by using XHTML.

Now, he has to deal with the HTML editor and the fact that users might not enter valid XHTML. Now, he could

a) remove the HTML editor completely
b) beat the user over the head with error messages until they enter valid XHTML
c) do what the web browsers do, which is convert their invalid HTML into valid XHTML