- ignore the comma, period, and space as delimiters and compare the values / entities against a dictionary of neighbouring words.
- don't ignore the comma and compare the values against a dictionary.
Put a priority (or, in machine learning terms, a classifier) on both outcomes, because the comma is not reliable in spoken language. That way it would interpret "peanuts butter" as [peanuts, butter] and "peanut butter" as [peanut butter].
PS. Now I hope that speech-to-text transcribes spoken "peanuts" correctly to [peanuts] and not [peanut], because that would fail.
PS2. The article itself doesn't mention the punctuation problem.
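The two-outcome idea above can be sketched roughly as follows. This is a toy illustration, not the article's method: the `ENTITIES` dictionary and the `interpret` function are hypothetical names, and a real system would score both readings with a trained classifier rather than a plain lookup.

```python
# Toy dictionary of known entities (hypothetical; a real system would
# use a large lexicon or a learned model).
ENTITIES = {"peanut butter", "peanuts", "butter", "milk", "bread"}

def interpret(tokens):
    """Prefer the compound reading of adjacent tokens when the
    dictionary knows it; otherwise fall back to separate items."""
    merged = " ".join(tokens)          # e.g. "peanut butter"
    if merged in ENTITIES:
        return [merged]                # compound entity wins
    # Separate reading: keep tokens the dictionary recognises,
    # or pass everything through if nothing matches.
    return [t for t in tokens if t in ENTITIES] or list(tokens)

print(interpret(["peanut", "butter"]))   # ['peanut butter']
print(interpret(["peanuts", "butter"]))  # ['peanuts', 'butter']
```

A classifier would replace the hard `in ENTITIES` test with a score for each reading, which is what makes the approach robust when the comma is missing from the ASR output.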
>PS2. The article itself doesn't mention the punctuation problem
It doesn't go into detail, but it does seem to mention it:
>Off-the-shelf broad parsers are intended to detect coordination structures, but they are often trained on written text with correct punctuation. Automatic speech recognition (ASR) outputs, by contrast, often lack punctuation, and spoken language has different syntactic patterns than written language.