Defining the Semantic of Markup

Semantics in HTML markup is murky water. This is because web pages are usually not essay-style documents, which HTML was designed to mark up, and they contain information that is not actually relevant to what the page is about. Examples would be menus, shopping cart information, summaries of forum activity, and the other half of HTML design: user/human interfaces. To say or even think that HTML can encapsulate all the “meanings” that human language structures can come up with (which are actually infinite), not to mention the non-language structures found on web systems representing a computer system interface, is naive. It is also an assumption that has never been backed up in argument by any standards body, and that’s because it’s simply wrong. The Microformat standard and now the POSH process seem to be unwittingly dealing with the problem without understanding it. This is actually an applied philosophy problem!

Tim Berners-Lee created the original HTML (and associated protocols such as HTTP) on a NeXTcube workstation using the NeXTSTEP development environment. At the time, HTML was not a specification, but a collection of loosely defined elements to solve an immediate problem: the communication and dissemination of ongoing research between Berners-Lee and his colleagues. His solution later combined with the emerging international and public internet to garner worldwide attention.

Don’t get the wrong idea

There is a lot of bombast being paraded around the internet by standards fanatics and “web gurus” about the need for ‘Semantic Markup’, but few actually state the basis from which they define ‘Semantic Markup’. Let’s try to correct this by first having a look at the definition of ‘semantic’ (from the dictionary that comes with OS X):

semantic: relating to meaning in language or logic

and so let’s look up ‘meaning’, which gives:

meaning: important or worth while quality; purpose

So what then is the ‘purpose’ of semantic markup, from a practical point of view? It’s for machines to read and parse (understand). The machines I’m talking about are browsers that might understand Microformats or, more importantly, search engines.

POSH isn’t

Jeremy Keith is trying to separate Microformats from POSH formats (or create more acronyms) by splitting the class, Socrates style. He defines the difference (not very clearly) as Microformats being decided by the group and POSH being created by individuals. When looking at the actual physical differences, there don’t seem to be any, hence I find his argument a bit pretentious.

I think that “enthusiastic developers toil in isolation” is actually what gets things done on the web. Pick any technology on the web, including HTML as seen above, and you’ll find it usually began with one man (ya, not many women in there) and a few late nights. The technology usually comes before the standard, the necessity before the invention.

Microformats ARE NOT a subset of POSH

Microformats are a subset of meaningful markup

Microformats (a naming convention for the ‘class’ attribute of HTML elements) give meaning to markup, in line with what I have said above, because they give a reason to format the markup in a certain way, i.e. some browsers can read data marked up to a Microformat standard straight off the HTML (still not a great reason, as it’s not very widely implemented yet).

The problem is that by using class names instead of pure tag configurations you circumvent the markup completely. It doesn’t matter whether you use an H1 or a P anymore if you give it a class of ‘title’. It might make you ‘feel better’, adhering to the standard, but that was not a problem that needed to be solved (well, for some standards geeks it is).
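To make that concrete, here is a rough sketch of the kind of tool that reads class names off a page. It is not any real Microformats parser; the `ClassTextReader` class and `read_classes` function are names I made up for illustration, and it assumes reasonably well-formed markup. The point is only that a reader keyed on class names never looks at which tag carried the class.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change the depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextReader(HTMLParser):
    """Hypothetical reader: collects the text of elements by class name."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.found = {}   # class name -> list of text values
        self._depth = 0   # current nesting depth
        self._open = []   # (depth, class name, text parts) still being read

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self._depth += 1
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in self.wanted:
                # Note: the tag name is never recorded, only the class.
                self._open.append((self._depth, name, []))

    def handle_data(self, data):
        for _, _, parts in self._open:
            parts.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        # Close every captured element that was opened at this depth.
        while self._open and self._open[-1][0] == self._depth:
            _, name, parts = self._open.pop()
            self.found.setdefault(name, []).append("".join(parts).strip())
        self._depth -= 1


def read_classes(html, wanted):
    reader = ClassTextReader(wanted)
    reader.feed(html)
    reader.close()
    return reader.found


with_h1 = '<div class="vcard"><h1 class="title">Editor</h1></div>'
with_p = '<div class="vcard"><p class="title">Editor</p></div>'
print(read_classes(with_h1, ["title"]) == read_classes(with_p, ["title"]))
```

Swap the H1 for a P and a reader like this gets exactly the same data, which is the circumvention I mean.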

The right things for the wrong reasons

Don’t get me wrong, I believe web standards, Microformats and POSH are good and just, but my reasons are different AND justified based on purpose:

What IS meaningful markup

Up until IE6 there was very little incentive for web publishers to use correct HTML markup. This was because browsers would render badly formed HTML just as well as, and sometimes better than, valid HTML. It wasn’t until IE6 started rendering valid markup in a way that was more in line with the standards that valid HTML actually had ‘meaning’. The page would render closer to what the standards defined, which meant that you could get your visual design to look roughly the same in most browsers with less work. There was suddenly a reason to use Web Standards for markup, and to this day it is the only reason.

Semantic markup is good for Google…mostly

The rise of Google as THE number one search engine of the web has given them the power to ‘encourage’ the proper use of HTML again. Note I said HTML not standards. If you are very interested in SEO then you know you should:

  1. Have the important content at the top of the page, as Google doesn’t bother with the whole page.
  2. Make that first paragraph keyword heavy, as it will come up in search results and will most likely be all that is indexed by Google.
  3. Have the H1 and TITLE tags agree, as this boosts your ranking, particularly an H1 at the very top of your content.
  4. Make the first 12 links on your page important, as these are the only ones that get looked at.
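A couple of these rules of thumb are easy to check mechanically. The sketch below is only that: `PageOutline` and `check_page` are names invented for this example, the limit of 12 links is simply my number from the list above rather than anything Google documents, and “agree” is taken to mean the same text ignoring case and extra whitespace.

```python
from html.parser import HTMLParser


class PageOutline(HTMLParser):
    """Hypothetical helper: records the TITLE, the first H1 and the links."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1 = ""
        self.links = []        # href values in document order
        self._inside = None    # "title" or "h1" while inside that element
        self._seen_h1 = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._inside = "title"
        elif tag == "h1" and not self._seen_h1:
            self._inside = "h1"
            self._seen_h1 = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        if self._inside == "title":
            self.title += data
        elif self._inside == "h1":
            self.h1 += data

    def handle_endtag(self, tag):
        if tag == self._inside:
            self._inside = None


def normalise(text):
    # Collapse whitespace and ignore case before comparing.
    return " ".join(text.split()).lower()


def check_page(html, link_limit=12):
    page = PageOutline()
    page.feed(html)
    page.close()
    return {
        "title_matches_h1": bool(page.h1)
        and normalise(page.title) == normalise(page.h1),
        "leading_links": page.links[:link_limit],
    }
```

Running `check_page` over a page gives you a yes/no on rule 3 and the list of links that rule 4 says you should care about.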

The ‘meaning’ thus becomes what Google likes to see, and this becomes the key factor in deciding how to mark up our documents. Google becomes the audience for the markup (not the people viewing the page). This just so happens to be good for accessibility. What the browser renders, the visual aesthetic, is the meaning to the human audience (if they can see), but this is subject to the variable ‘user agent’ that the viewer happens to be using.

Microformats

It’s not Microformats that give semantic class names a meaning but the tools that use them. To quote Jeremy Keith:

There is a misconception that the word “microformat” refers to any re-usable piece of markup. That simply isn’t true. Microformats need to also solve problems

Again, the purpose behind a technology, the reason that gives it meaning, is the need to “solve problems”.

A use for the POSH

Finally, there is a conceivable reason to use the POSH process (as that’s all it seems to be, apart from a silly acronym) beyond the feeling that you’re doing the right thing.

One of the most labour-intensive and complex web development tasks I find is CSS layout. Getting a site to look or behave the way the client/designer imagines and making it cross-browser is tricky and often tedious. Hence for the last couple of years I’ve been all for the copy and paste approach as far as CSS goes.

POSH would seem to offer a way of publishing “feel good” semantic template HTML and then having a library of CSS templates to go with each one. This could potentially allow whole sites to be constructed from prefabricated parts, and perhaps lead to a design movement similar to the Bauhaus! Designers could design CSS templates that accommodate the most common POSH markup, and they would be applicable across every CMS that supports the standard.

Conclusion

A lot of the debate on Semantic Markup doesn’t actually have a basis, as there is a lot of ambiguity about what the “Semantic” is. Given that the original semantic of HTML was about exchanging scientific research, and most of the web is not about that anymore, we have to find out for ourselves what is semantic markup and what is not. I think we can find this out by thinking about problem solving before we think about standards, and by looking for purpose in order to understand the meaning. From this approach we can see when standards are useful and when they are not, and more importantly say why!
