AI for understanding ambiguity

Please do not park bicycles against these railings as they may be removed – the railings or the bikes? Understanding the meaning is easy for us, harder for machines

Last year I wrote a post on whether machines could ever think[1].  Recently, in addition to all the general chatbot competitions, a new type of test has emerged for deeper contextual understanding, rather than the dumb and obvious meanings of words.  English[2] has a rich variety of word meanings, with the primary as the most common and the secondary and tertiary meanings further down in the dictionary.  It's probably been a while since you last sat down and read a dictionary, or even used an online one other than to find a synonym, an antonym or to check your spelling[3], but as humans we rely mostly on the vocabulary and context we've picked up from education and experience.

But how do you even begin to program this into a machine?  Simply providing a dictionary is no better than giving a human a dictionary – you don't learn phrasing and context.  If you pre-program set phrases, then you can suffer from an inability to understand the ambiguous.  As humans, we can usually[4] disambiguate sentences, picking the correct meaning from the surrounding clause, or even an entire paragraph, and work out what a pronoun refers to.  Translators have to be excellent at this to pass on the correct meaning in a different language, sometimes lagging a sentence or two behind to ensure the context is correct[5].  So why is this important?
As our interaction with computing grows, and the drive to make this experience as natural as possible also increases, we need machines to understand our idiosyncrasies.  Amazon's Alexa already knows that if we ask:
Alexa, what’s the weather like?
then we implicitly mean what's the weather like in my current location now. Machines have always been bad at inferring missing information and at resolving pronouns such as 'it' and 'they' where there are multiple candidate objects in the sentence.
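The inference Alexa makes can be sketched as simple slot filling with defaults drawn from the device's context. This is only a toy illustration of the idea, not Amazon's actual implementation – every function and field name here is invented:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class QueryContext:
    """Hypothetical device context available when a query arrives."""
    location: str        # e.g. taken from the device's registered address
    timestamp: datetime  # when the query was spoken


def interpret_weather_query(utterance: str, ctx: QueryContext) -> dict:
    """Fill in the slots the speaker left implicit.

    'What's the weather like?' names no place and no time, so we
    default both from the device context - the inference humans
    make without thinking.
    """
    return {
        "intent": "get_weather",
        "location": ctx.location,   # implicit: "here"
        "time": ctx.timestamp,      # implicit: "now"
    }


ctx = QueryContext(location="Brighton", timestamp=datetime(2016, 7, 15, 9, 0))
parsed = interpret_weather_query("Alexa, what's the weather like?", ctx)
print(parsed["location"], parsed["time"].hour)  # Brighton 9
```

The hard part, of course, is not filling the slots but knowing which slots a given utterance leaves implicit in the first place.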
The Winograd Schema Challenge seeks to find applications that can resolve ambiguous sentences and avoid future recurrences of the "Siri, call me an ambulance" problem[6].  In this challenge, each ambiguous sentence was given with two options, only one of which was correct.  There is a grand prize of $25k for the first team that can score over 90% in two rounds.  As blogged by Will Knight, the best competitors in the latest competition scored 48%, only slightly above the score gained by guessing alone (45%).

I find this fascinating – why were the machines so bad?  What can be done to enhance common-sense understanding without having to hardcode all of our spoken and written inconsistencies and ambiguities?  Deep learning seems like an obvious approach here, although not all of the entrants used it, according to Will.  Perhaps this is because there are still difficulties in training language-based networks without large amounts of labelled data.  However, the most likely reason is that the focus of the industry hasn't yet been on these sorts of common-sense problems – there have been other problems to solve first.  Resolving them will give us the next step in the evolution of the chatbot arena: true understanding of questions and comments.  A similar evolution will be necessary for appropriate responses and return questions, although I believe this will happen in parallel.
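To make the task concrete, a Winograd schema pairs an ambiguous sentence with two candidate referents for a pronoun, exactly one of which is correct. The sketch below shows the railings example from the photo alongside Levesque's classic trophy/suitcase schema, plus a guessing baseline that illustrates why chance sits near 50%. The data structure is my own illustration, not the official challenge format:

```python
import random
from dataclasses import dataclass


@dataclass
class WinogradSchema:
    sentence: str
    pronoun: str
    candidates: tuple  # the two possible referents
    answer: str        # the correct one

schemas = [
    WinogradSchema(
        sentence="Please do not park bicycles against these railings "
                 "as they may be removed.",
        pronoun="they",
        candidates=("the bicycles", "the railings"),
        answer="the bicycles",
    ),
    WinogradSchema(
        sentence="The trophy would not fit in the suitcase because "
                 "it was too big.",
        pronoun="it",
        candidates=("the trophy", "the suitcase"),
        answer="the trophy",
    ),
]


def random_guesser(schema: WinogradSchema) -> str:
    """Baseline with no understanding at all: pick a referent at random."""
    return random.choice(schema.candidates)

# A coin-flip guesser converges on roughly 50% over many trials -
# which is why scores in the mid-40s represent almost no understanding.
random.seed(0)
trials = 10_000
correct = sum(
    random_guesser(s) == s.answer for s in schemas for _ in range(trials)
)
accuracy = correct / (trials * len(schemas))
print(f"{accuracy:.2%}")
```

Swapping a single word ("too big" vs "too small") flips the correct referent without changing the sentence's statistics much, which is what makes the schemas so resistant to shallow pattern matching.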
Edit:  If you want to know more about the challenge from the perspective of a participant, Don Patrick’s excellent blog of the event is well worth a read.
We need machines to be excellent at this level of understanding to avoid the nightmare of “missing data” and “does not compute”.  When the industry really cracks this, how will we tell whether the responder to our queries is human or not?
  1. Spoiler: it depends on how you define thinking 😉
  2. As this is the language I know most intimately.
  3. I know it was 12 years for me – I was looking for some really cool spell names for a MUD I was involved in creating, and I believe I got as far as 'D', with each spell bearing relevance to the meaning of the word…
  4. But not always, and the ability varies from person to person.  The case of Derek Bentley, who was given a posthumous pardon after being hanged, is an interesting example.  When his accomplice in the burglary was asked to hand over the gun, Bentley was alleged to have said "let him have it" – did he mean "let him have the gun" or, more colloquially, "shoot him"?  His friend chose the latter and, despite Derek himself having learning difficulties, both boys were convicted of murder.  Assuming these words were even spoken.  See the Wikipedia entry on the case.
  5. If you want an example of this in real time, the subtitles on live news are a good one to watch – these cannot be prepared in advance, and the speech recognition dumbly transcribes what it thinks rather than what is said.
  6. Which has since been fixed and used successfully.

Published by


Dr Janet is a Molecular Biochemistry graduate from Oxford University with a doctorate in Computational Neuroscience from Sussex. I'm currently studying for a third degree, in Mathematics, with the Open University. During the day, and sometimes out of hours, I work as a Chief Science Officer. You can read all about that on my LinkedIn page.

2 thoughts on “AI for understanding ambiguity”

  1. Nice post, I like the practical bicycle and railings example 🙂
    As one of the contestants, here's my explanation of my program's low score (see website). Though, truth be told, I only ever developed a partial solution because, as you say, there are other problems to solve first that are of greater priority. This contest required a combination of language understanding, reasoning, and all the common knowledge in the world, and all three of those are still unsolved problems by themselves.

    The entry that used deep learning scored 58% after the organisers fixed a technicality with the input. I don't expect this approach to be a full solution either, as deep learning goes by the statistics of word occurrences. As soon as you say something that is statistically uncommon, it'll still misinterpret it. However, since the majority of things we say are literally common, it will probably outdo other approaches for a long while.

    1. Thanks for reading 🙂 Lovely to hear about your involvement – sadly I only found out about this after the event. I think you’re correct in your assessment, and I believe that there will be a better way of doing it than reusing the current techniques. Maybe we need to understand our own decision making skills better 🙂

Comments are closed.