
How not to do a Turing test

The idiot news people (sorry BBC) made a splash recently by claiming that a computer had successfully passed the Turing test. This is an idea modified from an article by the great Alan Turing, who suggested that a good test of artificial intelligence is to communicate with the computer down the wire and if you can't tell whether it's human or computer, then it passes - you have AI. (Turing's original concept is actually significantly more confusing, but this is the version usually given.)

There are several problems with this story. One is that the test as described is far too easy to pass. All that is required is that the machine is 'mistaken for a human more than 30% of the time during a series of five minute keyboard conversations.' That's a pretty low pass rate. I don't think you even get a GCSE for a 30% success. (Turing actually asked 'Will the interrogator decide wrongly as often when the game is played like this as he does when the game is played between a man and a woman?' - in his version the computer was rather oddly compared with two humans, one who always tells the truth and one who doesn't.) Apparently the chatbot 'Eugene Goostman' convinced 33% of the judges that it was human, therefore the organisers claim a success. What we aren't told is how many judges there were - if there were only 3, we are talking about one person being convinced - hardly impressive.
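
To see how little the 'more than 30%' bar asks for, here is a rough sketch in Python. The panel sizes are made up purely for illustration (as I say, we aren't told the real number of judges), and the function name is my own invention.

```python
# Threshold quoted in the post: the machine must be taken for a human
# "more than 30% of the time".
PASS_THRESHOLD_PERCENT = 30


def min_judges_fooled(num_judges, threshold_percent=PASS_THRESHOLD_PERCENT):
    """Smallest number of judges that is strictly more than the threshold.

    Integer arithmetic is used so that rounding errors cannot creep in.
    """
    return (num_judges * threshold_percent) // 100 + 1


if __name__ == "__main__":
    # Hypothetical panel sizes, not the real figure from the Reading event.
    for panel_size in (3, 10, 30, 100):
        needed = min_judges_fooled(panel_size)
        share = 100 * needed / panel_size
        print(f"{panel_size} judges: {needed} fooled is enough ({share:.1f}%)")
```

With a panel of three, a single convinced judge clears the bar, which is exactly the worry: without the panel size, the percentage alone tells us very little.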

Frankly this test at Reading University was a farce, and it's very sad that some academics enthused over the success the way they did (and that the Princeton web page portrayed above seems to think it was a success). The fact is, you only had to spend the requisite five minutes with Eugene and you would know perfectly well that 'he' is a program. Interestingly, the team behind Goostman have taken the chatbot down, probably due to the hilarity of those trying it out and discovering just how bad it is. (It should be here.) But even the simplest of tactics - asking the bot what a word the bot itself had used meant - showed it up as a failure. It couldn't explain the meaning of a single word it used. A 13-year-old Ukrainian speaking a second language, as Eugene is supposed to be, might not give good dictionary definitions, but could and would have a stab at this. And there are lots of other easy conversational ploys that the chatbot failed on.

If you would like a go at a very early chatbot that was also briefly claimed to have passed the Turing test, take a look at the grandmother of them all, Eliza.

I leave the final word with Dean Burnett, who claimed in a Guardian article that an actual 13-year-old boy had passed the Turing test and was declared human. Now that, surely, is impossible.
