Wednesday, Apr 23, 2008, 18:23

Hi, I’m Aristotle Pagaltzis and this is my weblog. You will find the 9 most recent articles below, all 427 of them in the archive, and new ones in the newsfeed.

It’s a topsy-turvy world

Bertelsmann has announced that they’ll print a single-volume excerpt of the German Wikipedia [German press release].

Meanwhile, Encyclopædia Britannica are now giving away web access to their content.

— Wednesday, Apr 23, 2008

Filed under “unclear on the concept”

Slim Amamou:

REST imposes too many constraints.

(In comments to an excellent note on REST by Subbu.)

— Wednesday, Apr 23, 2008

That looks about right

$ history|awk '{print $2}'|sort|uniq -c|sort -rn|head
    424 cd
    263 rm
    243 mplayer
    241 git
    179 sudo
    155 pod
    142 ack
    141 perl
    101 ll
    100 mv

(Yes, I am participating in another meme – all the while not writing much else. Sorry, sorry.)

— Thursday, Apr 10, 2008

Via Jakub Steiner: Interesting Statistics.

No credit where no credit is due

John Gruber:

I do think the IE team deserves credit for having floated the idea for opt-in version targeting rather than just going ahead and implementing it.

Err, “floated the idea”? I thought what I read was an announcement of a fait accompli. At no time did it strike me as though Microsoft had left the issue open-ended. That they subsequently revoked their proclaimed decision came out of left field; would this have been the case if what they first did was in fact merely “floating the idea”?

I’m glad that someone inside Microsoft apparently somehow managed to overrule someone else (whoever the people involved are), but I can find no way to interpret Microsoft’s initial course of action as commendable.

Furthermore, the question Eric Meyer asked is still open: even though Microsoft reneged on the most objectionable part of the announcement, the fact that IE 8 will have three rendering modes, including a frozen-in-time one with all its implications for competitors, still stands.

— Wednesday, Mar 5, 2008

You know what’s tough?

Forget Markup Barbie… I want Unicode Barbie. When you pull her string, she says “text is hard.”

— Monday, Feb 25, 2008

Roy T. Fielding has a posse

… err, I mean, has a weblog.

Welcome aboard, Roy.

— Sunday, Feb 17, 2008

Continuous partial debate

Recently there has been a bit of a spat over a proposal for doing RESTful partial updates by Joe Gregorio. He bills it as a how-to, but considering that he closes with a list of open issues, I think that’s a bit of a misnomer. Rob Sayre criticises it as not RESTful, but I don’t see how it’s not: the URI is constructed via a server-published template and points to a GETtable synthesised resource, which is perfectly fine. I see the problem elsewhere, but let me get there by answering Joe’s questions and extrapolating from where that takes me.

  1. Do you have to include the xml:id attributes when you PUT back an update?

    Yes. Otherwise, if there are several editable elements with identical local-names in a row, how does the server tell which is which?

  2. Do the xml:id attributes appear when you do a GET on such a resource?

    Ditto.

  3. Obviously the representation of a partial update resource is not a valid Atom Entry. What should be the mime-type of that resource?

    The document element should not be atom:entry. This is not an Atom Entry Document; it’s merely a bag for a bunch of fragments with identifiers. Make it t:partial or something like that.

  4. There are undoubtedly XML parsers that will choke on xml:id attributes even though according to the XML specification the xml qname is reserved and should always be defined. Are these problems widespread enough to kill the use of xml:id and warrant the creation of an id attribute in another namespace?

    There may be xml:id attributes present for purposes other than partial update. Overloading xml:id leaves the client with no idea about which elements are editable and which ones are merely fragment-addressable. Use t:edit-id instead of xml:id.

  5. Can t:link_template elements use the same IANA Atom Link Relation Registry or do they need their own registry, or do we just hold our noses and put the URI Template in an atom:link element? Obviously the set of t:link_template relations is a super-set of atom:link relations. The same problem also exists for using URI Templates in HTML link elements.

    No templates in atom:link. Applications expect URIs there (and specifically dereferenceable ones, for the most part). The IANA Atom Link Relation Registry is fine; whether a client gets http://example.org/foo from an atom:link or constructs it via http://example.org/{id} in a t:link-template should be irrelevant.

  6. How do you handle descendants that aren’t children of the document element?

    There is no point in preserving the original document’s structure if you only want to ship fragments of it and all of the fragments are identified without reliance on the document’s structure. So you make no attempt to preserve the nesting of the original document, and just put all the identified fragments directly under the document element.
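To make the answer to question 5 concrete, here is a minimal sketch of why the source of the URI should not matter. It assumes only the simplest form of URI Template, a bare `{name}` placeholder; the `expand` helper is hypothetical and is not taken from Joe’s proposal or from any library.

```python
import re
from urllib.parse import quote


def expand(template: str, values: dict) -> str:
    """Expand bare {name} placeholders; a stand-in for a real URI Template library."""
    def substitute(match: re.Match) -> str:
        # Percent-encode the value so the result is still a valid URI.
        return quote(str(values[match.group(1)]), safe="")
    return re.sub(r"\{([A-Za-z0-9_]+)\}", substitute, template)


# One URI arrives ready-made in an atom:link, the other is built from a template.
from_link = "http://example.org/foo"
from_template = expand("http://example.org/{id}", {"id": "foo"})

# Either way the client ends up holding the same URI, so the same link relation applies.
same_uri = from_link == from_template
```

Once expanded, the client has an ordinary URI and can treat its relation exactly as it would for an atom:link.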

Once you take these steps (particularly #6 and #3 in combination), it becomes apparent that what you have here is really no more than a home-grown patch format, just as Dare Obasanjo commented. The only uncommon feature of this proposal is the one Joe himself pointed out: it puts the server in control of what portions of the original document it is willing to allow partial updates on.

I don’t know if that’s a good idea; Dare thinks it’s not, and I’m inclined to agree.

But if you think it is, you could just as well go all the way: first, give this patch document format a MIME type. Then define a t:delete element to put under the t:partial document element, where it can appear any number of times, and whose content is the t:edit-id of an element to be deleted. Thus, the URI of the synthetic partial resource is no longer needed to interpret the patch document correctly. The bottom line is that you can stuff such a document into a PATCH request to the Atompub edit URI, obviating the URI construction gymnastics in their entirety.
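As a rough sketch of how a server might apply such a patch document, assuming the element and attribute names suggested above (t:partial, t:edit-id, t:delete): the namespace URI bound to `t:` and the `apply_patch` function are made up for illustration and appear in no specification.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
T = "http://example.org/ns/partial"  # hypothetical namespace for the t: prefix
EDIT_ID = f"{{{T}}}edit-id"
DELETE = f"{{{T}}}delete"


def apply_patch(entry: ET.Element, patch: ET.Element) -> None:
    """Apply a flat t:partial document to an entry, in place."""
    # Index every editable element by its t:edit-id, remembering its parent.
    index = {}
    for parent in entry.iter():
        for child in parent:
            edit_id = child.get(EDIT_ID)
            if edit_id is not None:
                index[edit_id] = (parent, child)

    # Fragments sit directly under t:partial; nesting is not preserved.
    # An unknown id raises KeyError here; a real server would reject the request.
    for fragment in patch:
        if fragment.tag == DELETE:
            parent, target = index[fragment.text.strip()]
            parent.remove(target)
        else:
            parent, target = index[fragment.get(EDIT_ID)]
            position = list(parent).index(target)
            parent.remove(target)
            parent.insert(position, fragment)


entry = ET.fromstring(
    f'<entry xmlns="{ATOM}" xmlns:t="{T}">'
    '<title t:edit-id="a">Old title</title>'
    '<category t:edit-id="b" term="x"/>'
    '<category t:edit-id="c" term="y"/>'
    '<id>urn:example:1</id>'
    '</entry>'
)
patch = ET.fromstring(
    f'<t:partial xmlns="{ATOM}" xmlns:t="{T}">'
    '<title t:edit-id="a">New title</title>'
    '<t:delete>c</t:delete>'
    '</t:partial>'
)
apply_patch(entry, patch)
```

The two atom:category elements have the same local name, so only the t:edit-id tells the server which one to drop. Elements without a t:edit-id, such as atom:id here, cannot be addressed by the patch at all, which is how the server stays in control.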

In essence, this is what Rob proposed, except for retaining the original proposal’s property of putting the server in control.

— Sunday, Feb 17, 2008

Semantic duct tape

James Gottlieb:

So far, I have the following as a way to build a server (subject to change, of course):

package My::Server;

use RDF::Server;

with 'MooseX::Daemonize';
with 'MooseX::SimpleConfig';
with 'MooseX::Getopt';

interface 'REST';
protocol 'HTTP';
style 'Atom';

— Monday, Feb 11, 2008

Paul Graham’s kind of dirty

Arc is finally released as a (by the sound of it) wildly unfinished snapshot. In his notes about the decision, Paul expounds on the rationale for his design decisions to do things like skip Unicode support [1] and write HTML libraries that output presentational tables. Let’s take two quotations and put them next to each other.

One is always a bit sheepish about writing quick and dirty programs. And yet some, if not most, of the best programs began that way. And some, if not most, of the most spectacular failures in software have been perpetrated by people trying to do the opposite.

So experience suggests we should embrace dirtiness. Or at least some forms of it; in other ways, the best quick-and-dirty programs are usually quite clean. Which kind of dirtiness is bad and which is good? The best kind of quick and dirty programs seem to be ones that are mathematically elegant, but missing features – […]

Arc tries to be a language that’s dirty in the right ways. […] The kind of dirtiness Arc seeks to avoid is verbose, repetitive source code. The way you avoid that is not by forbidding programmers to write it, but by making it easy to write code that’s compact. One of the things I did while I was writing Arc was to comb through applications asking: what can I do to the language to make this shorter?

Clearly, the way to make programs written in the language shorter is to force them to deal with Unicode on their own.

He does make an effort to poison the well by saying that people who would care about these things probably wouldn’t like Arc much to begin with. Maybe the fact that I happen to speak all of Greek, German and English means that I shouldn’t have an interest in Arc, then.

Note I’m not saying it’s necessary for the first unfinished cut of a language to have Unicode support; but Paul seems to go rather beyond saying that. Note further that I don’t care one whit about his silly claims that presentational markup is the right kind of dirtiness: that can always be fixed by libraries. Character strings, however, are something that you really do need to get right at the core language level. You cannot leave strings for the libraries to fix. If you think that that’s a viable route, I have a bridge to sell you. And it’s written in C++.

Consider next that he mentions up front that it took Guido van Rossum the entire last year to rework Python’s character string support because of back-compatibility issues. (Other languages have had similar experiences.)

Leaving Unicode support in a language “for later” means you will spend a huge chunk of time sometime in the future to put it into the language – or you won’t, and then programs written in that language will forever be verbose when dealing with strings.
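A small illustration of the difference, written in Python rather than Arc since I have not checked how Arc actually behaves: the `bytes` value stands in for what a language with octet-only strings would hand you, and `truncate_octets` is a hypothetical helper.

```python
word = "ΨΥΧΗ"                  # four Greek capital letters
octets = word.encode("utf-8")  # the same text as an octet-only language sees it

char_count = len(word)         # counts characters
octet_count = len(octets)      # counts bytes; each of these letters takes two in UTF-8

# Case conversion on octets only knows about ASCII, so it leaves the word untouched.
lowered_text = word.lower()
lowered_octets = octets.lower()


def truncate_octets(data: bytes, n: int) -> str:
    """Cut after n octets, as a byte-oriented substring would, then try to decode."""
    try:
        return data[:n].decode("utf-8")
    except UnicodeDecodeError:
        return "<invalid UTF-8>"


first_two_chars = word[:2]                      # slicing by character is always safe
cut_mid_character = truncate_octets(octets, 3)  # slicing by octet can split a letter
```

Every program that wants the character-level answers has to carry its own decoding, counting and case-mapping code around, and that is exactly the kind of verbosity the language was supposed to spare it.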

The right kind of dirty?

Footnotes:

  1. He says it supports only ASCII; I think he means octets instead, but I doubt he cares one way or another.

— Wednesday, Jan 30, 2008