
August 18, 2003

They still don’t get it.

Christopher Lydon is a fellow at the Berkman Center for Internet & Society at Harvard Law School. In the last month he has used his radio background to do a series of interviews with influential and creative bloggers, and then made them available exclusively on the Web as MP3s. I love his interviews: they’ve touched on a few topics important to the world of the web, and they remind me that there are many people out there, of whom I will be eternally jealous, who can both write and talk. Chris Lydon seems to be fairly well clued in to what’s going on out there in cyberspace, and I think everyone should check out his interviews.

But his website is a mess.

Not visually, mind you. Just by looking at it, you’d never know. It’s a simple two-column layout with a clean look. But when you get under the hood, you see a mess of markup that’s riddled with font tags, redundant classes, and every possible abuse of HTML. But at the same time it also makes generous use of CSS, even though most of that is possibly redundant too. Dave Winer over the weekend pointed out that the page doesn’t look right in Safari, and he was wondering why. He later posted a link to the W3C validator’s assessment of the page, and it comes up with 65 errors, most of them extremely simple to fix. I’ve been noodling around in HTML enough to know that the first step in debugging a page is to make sure the HTML validates. If your document doesn’t follow even the basic rules of HTML, there’s no telling what kind of unpredictable behavior you’re going to get when a browser tries to chew through it. That’s why tag soup is such a bad idea; you’re relying on browser quirks and screwy rendering to display your page. You’re counting on the browser screwing things up, so you build it already screwed up, hoping to keep one step ahead of the rendering engine, so that everything can be screwed up together and somehow work out right in the end. Meanwhile you’ve created the ugliest markup imaginable, and by doing so increased the chances that your carefully super-glued-together page will fall apart in another browser. Writing to the quirks and inconsistencies of what the web used to be is never a good idea. If you write a site for IE, it will invariably fall apart in Mozilla. And vice versa. If you write to one universal standard that everyone supports, it will work everywhere. This is what the Web Standards Project has been fighting for. 
They’ve convinced the browser makers, they’re making headway with the tool vendors, but there are still legions of authors and developers out there, people writing their own sites or making templates for Blogger and Radio and Movable Type, who don’t get it yet. They are the ones that seemingly refuse to accept that the landscape of the Web has changed, that markup matters, and that it’s fine to write crappy markup when you’re first starting out, but by the time you’ve gotten to the point where you’re creating tools and templates you need to know all of this. There is no excuse anymore.

It’s especially frustrating when you see people who are so devout in proclaiming that [fill-in-the-blank technology] must be written properly, and then they turn around and write some of the worst tag soup out there. I’m specifically going to point out Dave Winer here, because he was involved in creating Chris Lydon’s site, and he has been vocal about how he doesn’t want to change his tag soup ways. And no, this is not a personality battle. That seems to be Dave’s favorite way of discrediting the views of his opponents. I am looking at this from a professional standpoint, and I hope to make a reasoned argument. I wouldn’t even call myself an “opponent” of Dave’s. I just don’t see why he refuses to learn more about HTML. He has, in recent months, called out many people and organizations for their use of “funky” RSS. He feels strongly about this, because he had a hand in creating RSS, and he feels that the simplicity is one of its best features. Keep all the extra cruft out of RSS, he says, and the world will be a better place. Aggregators will run better, and bandwidth will be used more efficiently. I agree with him on those points. RSS is a simple technology. No need to bulk it up. But I can’t understand why he doesn’t view HTML that way. HTML is an equally simple technology (most problems are created by browser bugs where CSS is concerned). If you look at the source of a well-written website, you will see simple elements being used, a minimum of attributes, and something that is very human readable. But many people, Dave included, are still writing HTML like it was written 9 years ago. Messy, bloated, and very, extremely “funky”, by his standard. “Funk”, according to Dave, is when you use the wrong technology for the job. Specifically, he says it’s when you use a namespace instead of a native RSS element. 
Let me extend that definition to say that “funk” is when you use a deprecated or otherwise incorrect-for-the-job HTML element to do something that should be handled by CSS. Using <blockquote> to indent? Funky. Giving the same class to a whole group of elements, when they all share the same containing element? Funky. Not closing every tag? Funky. Using <font>? Fun-Kay. HTML has finally reached a stable point where the syntax is simple and all browsers render a page the same. But these people are still partying like it’s 1996. And they’re the ones that are supposed to know better.
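
To make that definition a little more concrete, here is a rough Python sketch of a "funk" counter. It is only an illustration: the names FunkReport and funk_report and the sample markup are made up for this post, and the checks are deliberately crude. It counts start tags for a handful of presentational elements that HTML 4.01 marks as deprecated, and it counts how often each class name is used. A class that shows up many times is only a hint that a selector on the containing element might do the same job, not proof.

```python
from collections import Counter
from html.parser import HTMLParser

# A few presentational elements that HTML 4.01 marks as deprecated
# in favor of style sheets. This is not the complete list.
DEPRECATED_TAGS = {"basefont", "center", "font", "s", "strike", "u"}


class FunkReport(HTMLParser):
    """Hypothetical helper: tallies deprecated tags and class usage."""

    def __init__(self):
        super().__init__()
        self.deprecated = Counter()
        self.class_uses = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag in DEPRECATED_TAGS:
            self.deprecated[tag] += 1
        for name, value in attrs:
            if name == "class" and value:
                for cls in value.split():
                    self.class_uses[cls] += 1


def funk_report(markup):
    """Return (deprecated tag counts, classes used more than once)."""
    parser = FunkReport()
    parser.feed(markup)
    parser.close()
    repeated = {c: n for c, n in parser.class_uses.items() if n > 1}
    return dict(parser.deprecated), repeated


# Tag soup: unclosed paragraphs, font tags, the same class on every sibling.
soup = (
    '<div><p class="post"><font size="2">One</font>'
    '<p class="post"><font size="2">Two</font>'
    '<p class="post">Three</div>'
)

# The same content with the styling left to a selector like "div.posts p".
clean = '<div class="posts"><p>One</p><p>Two</p><p>Three</p></div>'

print(funk_report(soup))
print(funk_report(clean))
```

The second version leaves nothing for the counter to complain about, which is the point: one class on the container, and CSS does the rest. A check like this is no substitute for running the page through the W3C validator.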

We are entering an age when it is less and less necessary for ordinary people to know HTML. Weblog tools and CMSes are gaining popularity, and a larger number of sites are being built using them. But as the need for regular people to know HTML goes down, it goes up for the rest of us. We are at a time where, if you’re generating HTML in a professional capacity, there is no excuse for you not to know how it works. It’s part of the job. If you’re a tool vendor or a template creator, the need is even greater, because you’re creating the markup that’s going to go on other people’s sites. You have a responsibility to give them the best markup possible. You must be an HTML aficionado. You have to read and understand the specs coming out of the W3C. Anything less is amateurish and unprofessional. This doesn’t mean that 100% validation is strictly necessary. Leaving alt out of images may be an accessibility sin, but it’s forgivable. And everyone has been caught with unescaped ampersands in their URLs. Those kinds of things are practically unavoidable, and when they happen you’ll have an invalid page. But they are minor offenses. The problem is when the people working on huge site-generating tools like Radio, and creating templates for the same, think that using <blockquote> to indent your whole page is okay, and that using CSS combined with font tags, as well as using four &nbsp;s to indent paragraphs, is an acceptable practice. That is not acceptable. That’s like having an engineer at Ford not care if a couple of spark plug wires are swapped when a new car rolls off the line. “Well, the car runs, doesn’t it?”
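
For what it’s worth, the minor offenses are also the easy ones for a tool to catch or avoid. The following is a minimal Python sketch, not a validator: the names check, escape_href, and MinorOffenses and the sample page are invented for illustration, and the regular expressions are approximations. It counts images with no alt attribute, ampersands that don’t start an entity or character reference, and runs of four or more &nbsp;s. It also shows how a template could escape a URL before writing it into an href, using the standard library’s html.escape.

```python
import html
import re
from html.parser import HTMLParser

# An ampersand that does not begin something shaped like "&name;" or "&#38;".
BARE_AMPERSAND = re.compile(r"&(?!#?\w+;)")
# Four or more non-breaking spaces in a row, the old paragraph-indent trick.
NBSP_RUN = re.compile(r"(?:&nbsp;){4,}")


class MinorOffenses(HTMLParser):
    """Hypothetical helper: counts <img> tags that have no alt attribute."""

    def __init__(self):
        super().__init__()
        self.images_without_alt = 0

    def handle_starttag(self, tag, attrs):
        if tag == "img" and "alt" not in dict(attrs):
            self.images_without_alt += 1


def check(markup):
    """Return rough counts of three common problems in the raw source."""
    parser = MinorOffenses()
    parser.feed(markup)
    parser.close()
    return {
        "images_without_alt": parser.images_without_alt,
        "bare_ampersands": len(BARE_AMPERSAND.findall(markup)),
        "nbsp_runs": len(NBSP_RUN.findall(markup)),
    }


def escape_href(url):
    """Escape a URL for use inside a quoted HTML attribute."""
    return html.escape(url, quote=True)


# Made-up sample containing one of each problem.
page = (
    '<p>&nbsp;&nbsp;&nbsp;&nbsp;Hello '
    '<a href="archive.cgi?year=2003&month=8">August</a> '
    '<img src="me.jpg"></p>'
)

print(check(page))
print(escape_href("archive.cgi?year=2003&month=8"))
```

If a template runs every URL through something like escape_href, the ampersand problem never reaches the page, which is exactly the kind of thing a tool vendor can fix once for everybody.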

For shame. Writing proper HTML is easy. You can learn it in a couple of hours. Unlearning the old ways of doing things might be a little harder, but we’re still talking about a matter of a couple of days. And the payoff is enormous. Crappy HTML should be relegated to the realm of the amateur. But in fact, it seems that amateurs are the only ones capable of writing good markup and using CSS correctly. How did things get so reversed? Why is the web so exactly backwards from what makes sense? Anybody who is generating tag soup in a professional capacity needs to reevaluate their priorities, and understand that we are now in the 21st century. The web has changed in the last 9 years. The only thing holding us up is you.