I was at a deep learning conference recently, and the topic of how AI can improve healthcare came up. One panelist said that a startup they were working with wants to help doctors use AI and NLP to send claims to insurance companies in a way that won't be rejected. Another panelist said that he was working with another startup that wants to use AI and NLP to help insurance companies reject claims.
I think in the future we'll see their AI fighting against our AI in an arms race similar to the spam wars. The one with the most computing power and biggest dataset will win and humans will be at their mercy.
> One panelist said that a startup they were working with wants to help doctors use AI and NLP to send claims to insurance companies in a way that won't be rejected. Another panelist said that he was working with another startup that wants to use AI and NLP to help insurance companies reject claims.
But quite possibly "greater efficiency" according to a fitness function that's not accurately mapped onto "keeping humans alive"...
I wonder if this'll end up in an equivalent state to the "tank detection neural net" which learned with 100% accuracy that the researchers/trainers had always taken pictures of tanks on cloudy days and pictures without tanks on sunny days? ( https://www.jefftk.com/p/detecting-tanks )
Who'd bet against the doctor/insurer neural net training ending up approving all procedures where, say, the doctor ends up with a kickback from a drug company - instead of optimising for maximum human health benefit?
>But quite possibly "greater efficiency" according to a fitness function that's not accurately mapped onto "keeping humans alive"...
Since when was this ever the case? Especially in America? The US healthcare system is NOT built around providing adequate care for everyone, as far as I've read/heard.
It was always like this. In my opinion it doesn't make a difference if some guy is more intelligent and is therefore able to suppress others or if he uses an AI that is more intelligent. For me the result is the same: I get rekt.
There exists a pain threshold you don't want people to cross due to your automation; otherwise your developers/executives risk being killed by customers. Luddites, Ted Kaczynski, Nasim Aghdam, etc.
Google/Apple shuttle buses are being shot at with pellet guns today; imagine what happens when big AI corps openly work against the population. The Google AI / Amazon Rekognition protests suggest at least some employees have a shred of self-awareness and survival instinct.
For more detail plus working code: lesson 4 of the fast.ai course uses this technique to obtain (what was, at the time of writing) a state-of-the-art result on the IMDb dataset. By training a language model on the dataset, then using that model to fine-tune the sentiment classification task, they were able to achieve 94.5% accuracy.
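As a very rough sketch of that two-stage recipe, here is a toy, count-based stand-in (hypothetical code: the real approach fine-tunes a neural language model, and names like `pretrain_lm` and `fine_tune` are invented for illustration):

```python
from collections import Counter, defaultdict

def pretrain_lm(corpus):
    """Stage 1: fit a bigram 'language model' on unlabelled text
    (a crude stand-in for pretraining an LSTM language model)."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def featurize(lm_counts, sentence):
    """Reuse the pretrained model as a feature extractor: the average
    transition probability is a crude 'how familiar is this text' score."""
    tokens = sentence.lower().split()
    probs = []
    for prev, nxt in zip(tokens, tokens[1:]):
        total = sum(lm_counts[prev].values())
        probs.append(lm_counts[prev][nxt] / total if total else 0.0)
    return sum(probs) / len(probs) if probs else 0.0

def fine_tune(lm_counts, labelled):
    """Stage 2: fit a trivial threshold classifier on top of the reused
    representation (a stand-in for fine-tuning the full network)."""
    pos = [featurize(lm_counts, s) for s, y in labelled if y == 1]
    neg = [featurize(lm_counts, s) for s, y in labelled if y == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda s: 1 if featurize(lm_counts, s) >= threshold else 0
```

The point is only the shape of the pipeline: representations learned from raw text in stage 1 are reused, then adapted with a small amount of labelled data in stage 2.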
Well spotted - this is where I first created the algorithm that became ULMFiT! I wanted to show an example of transfer learning outside of computer vision for the course but couldn't find anything compelling. I was pretty sure a language model would work really well in NLP, so I tried it out, and was totally shocked when the very first model beat the previous state of the art!
Sebastian (the author of this article) saw the lesson, and was kind enough to run lots of experiments to test out the approach more carefully, and did a great job of writing up the results in a paper, which was then accepted at ACL.
For anyone just getting started on this I can't recommend fast.ai enough. It's extremely well done and I found it very intuitive. You are able to quickly apply some very advanced techniques.
I want to like fast.ai more, but in my opinion their code quality is just not good enough. Everything is badly named, badly explained / not explained at all, and if you need to adapt any of their code to work in different domains, you are going to have a bad time.
It's fine if you don't like it (although it may be you're just not quite used to it yet), but I'm not sure it's fair to call it "bad".
Everything is fully customisable in pytorch and this is explained with many examples in part 2 of the course. Written documentation is being worked on as we move towards a first release later this year (currently the library is still pre-release - it works fine and is used at many big and small companies, but there's still much to do to get it to a v1.0).
I agree with the poster you're replying to - the naming is not intuitive or approachable without that previous body of knowledge. It's enough that I would call it bad - I would flag the crap out of it in a code review.
I'm glad you linked to the style guide - I've often wondered where certain names come from.
It's not all bad. K for key, V for value, i for index - no disagreement. But the seeming aversion to any variable name longer than 2-3 characters might be great in mathematics, yet it makes for a nightmare of code that's optimised for writing rather than reading. It doesn't need to go to the extreme of UIMutableUserNotificationAction, but at least use words.
AI is already hard enough; it doesn't need to be made more cryptic with poor naming conventions. I get that for people in the industry it may be fine - but for it to break out into general use it'll need to be more social, which means simpler and clearer verbiage.
> It's fine if you don't like it (although it may be you're just not quite used to it yet), but I'm not sure it's fair to call it "bad".
It is totally fair to call a naming convention where all variables are 2- or 3-letter acronyms bad - at the very least un-pythonic, and certainly not suitable for an educational tool. If you think it's good practice to make your code shorter at the expense of your users needing an abbreviation guide, you're going to make adoption a much harder sell, especially if you're pitching at Python devs. Unlike the last few decades, we now live in an age of autocomplete; there is absolutely no need for this.
The cognitive load put on someone unfamiliar with your code is not acceptable:
LanguageModelData takes two LanguageModelLoaders as arguments, and then later in the code it produces a model - surely these two classes are named the wrong way round? You haven't explained what 'bs' or '1' are supposed to be. To figure out or remember what the other variables are, you need to chase up and down the notebook, which could easily be avoided if you just gave them descriptive names. You only use these variables once or twice, so I don't even understand what is gained by making their names so terse.
Regarding the short variable names, I actually like them a lot. It's nice because you don't have to read as much.
Once you are familiar enough with the abbreviations, it becomes much easier to read. Just from looking at this code, I can already tell you bs is batch size and bptt is backpropagation through time. And then probably dl is data loader, lm is language model, md is model data. No idea what vs or 1 is, but I can just look at the docs for LanguageModelData, which takes what, less than 30 seconds to read?
I think it's worth it because you can just look at one line of code and know what's going on, instead of parsing through a paragraph of code that does the same thing.
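For concreteness, here is a hypothetical side-by-side of the two styles under debate; the expansions follow the guesses in this thread (bs = batch size, bptt = backpropagation through time), not any actual fast.ai source:

```python
# Terse style: compact, but the reader must already know the abbreviations.
bs, bptt = 64, 70

# Descriptive style: longer to type, but self-explanatory on first read.
batch_size = 64
backprop_through_time_length = 70

# Both express exactly the same configuration.
assert (bs, bptt) == (batch_size, backprop_through_time_length)
```

Which one is "better" largely depends on whether the reader has already internalised the abbreviation guide.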
fast.ai is great and I also want to love it, but I find the coding style extremely unconventional and uninviting, to the point that I only use the library when there is really no better alternative (e.g. for the LR-finding features). Variable naming and wildcard imports would be my main complaints. But thanks a lot Jeremy for the course and for posting the style guide.
Do you have links to these pretrained models? The only one I am aware of is OpenAI's, where they fine-tuned a Transformer architecture for 1 month on 8 GPUs:
>It is now possible to grab a pretrained model and start producing state-of-the-art NLP results in a wide range of tasks with relatively little effort.
Are there any applications/websites where this can be seen in action? It's increasingly hard to judge how good state-of-the-art really is from research papers.
5 years ago (mostly for fun) I tried out the state-of-the-art - at the time - NLTK sentiment analyzer to correlate stock market changes with a variety of news/info sources.
I put it on the shelf because the sentiment analysis just wasn't up to snuff (i.e. the bias differentiation was too weak).
Not to be too obtuse, but isn't WordNet (you know, the project that inspired the creation of ImageNet) "an ImageNet for language"? It seems kind of weird to bring up ImageNet within the context of NLP and not mention WordNet once.
WordNet (as you probably know) is a database that groups English words into sets of synonyms. If you consider WordNet as a clustering of high-level classes, then you could argue that ImageNet is the "WordNet for vision", i.e. a clustering of object classes.
The article uses a different meaning of ImageNet, namely ImageNet as a pretraining task that can be used to learn representations that will likely be beneficial for many other tasks in the problem space. In this sense, you could use WordNet as an "ImageNet for language", e.g. by learning word representations based on the WordNet definitions. This is something people have done, but there are far more effective approaches.
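A toy sketch of that idea (hypothetical code; the `definitions` dict stands in for WordNet glosses, and real gloss-based methods are far more sophisticated):

```python
# Represent each word as a bag-of-words vector over its definition text.
definitions = {
    "bank":  "an institution that holds money for customers",
    "vault": "a secure room where money is held",
    "tree":  "a tall plant with a trunk and branches",
}

# Shared vocabulary over all definition words.
vocab = sorted({w for d in definitions.values() for w in d.split()})

def gloss_vector(word):
    """Binary vector: which vocabulary words appear in this word's gloss."""
    gloss_words = set(definitions[word].split())
    return [1 if v in gloss_words else 0 for v in vocab]

def overlap(a, b):
    """Crude similarity: how many definition words two entries share."""
    return sum(x & y for x, y in zip(gloss_vector(a), gloss_vector(b)))
```

With these toy glosses, "bank" overlaps with "vault" (both mention money) but not with "tree" - a definition-derived representation, as opposed to one learned from raw text.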
I hope this helped and was not too convoluted.
I don't think WordNet has ever been much of a thing in NLP - certainly nothing like what ImageNet has been in CV. WordNet only captures simple word-to-word relationships. "NLP" tends to denote more syntactic, phrase- or sentence-level text analysis; bag-of-words tools like WordNet or TF-IDF are often not considered "true" NLP, but might be called text mining instead.
The phrase "Imagenet moment" is generally used to refer to the success of deep learning in the ILSVRC 2012 competition, which used the Imagenet dataset. This is the case in this article.
TLDR: the standard practice of using 'word vectors' (numeric vector representations of words) may soon be superseded by using entire pretrained neural nets, as is standard in CV, and we have both conceptual and empirical reasons to believe language modeling is how it'll happen.
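A toy contrast of the two regimes (hypothetical code: the "encoder" below is an invented stand-in, not a real pretrained network):

```python
# Static word vectors: one fixed representation per word, context-free.
word_vectors = {
    "river": [1.0, 0.0],
    "money": [0.0, 1.0],
    "bank":  [0.5, 0.5],
}

def static_rep(tokens, i):
    """A word-vector lookup returns the same thing in every sentence."""
    return word_vectors[tokens[i]]

def contextual_rep(tokens, i):
    """Stand-in for a pretrained encoder: the representation of a word
    shifts depending on its neighbours, so 'bank' near 'river' comes out
    differently from 'bank' near 'money'."""
    base = word_vectors[tokens[i]]
    left = word_vectors[tokens[i - 1]] if i > 0 else [0.0, 0.0]
    return [0.8 * b + 0.2 * l for b, l in zip(base, left)]
```

That context-sensitivity is the practical difference: a whole pretrained network carries information a per-word lookup table cannot.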
Helped edit this piece, think it is spot on - exciting times for NLP.
>> In order to predict the most probable next word in a sentence, a model is required not only to be able to express syntax (the grammatical form of the predicted word must match its modifier or verb) but also model semantics. Even more, the most accurate models must incorporate what could be considered world knowledge or common sense.
So, the first sentence in this passage is a huge assumption. For a model to predict the next token (word or character) in a string, all it has to do is be able to predict the next token in a string. In other words, it needs to model structure. Modelling semantics is not required.
Indeed, there exists a wide variety of models that can predict the most likely next token in a string. The simplest of these are n-gram models, which can do this task reasonably well. Maybe what that first sentence is trying to say is that to predict the next token with good accuracy, modelling of semantics is required, but that is still a great, big, huge leap of reasoning. Again: structure is probably sufficient. A very accurate model of structure is still only a model of structure.
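For example, a minimal n-gram model of the kind described above (illustrative code, not from the article) predicts the next token purely from surface co-occurrence counts, with no notion of meaning:

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each token, which tokens follow it."""
    counts = defaultdict(Counter)
    tokens = text.lower().split()
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower, or None for unseen words."""
    followers = counts.get(word.lower())
    return followers.most_common(1)[0][0] if followers else None
```

Trained on "the cat sat on the mat the cat meowed", it predicts "cat" after "the" simply because that pair occurred most often - structure, not semantics.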
It's important to consider what we mean when we talk about modelling language probabilistically. When humans generate (or recognise) speech, we don't do it stochastically, by choosing the most likely utterance from a distribution. Instead, we - very deterministically - say what we want to say. Unfortunately, it is impossible to observe "what we want to say" (i.e. our motivation for emitting an utterance). We are left with observing - and modelling - only what we actually say. The result is models that can capture the structure of utterances, but are completely incapable of generating new language that makes any sense - i.e. they produce gibberish.
It is also worth considering how semantic modelling tasks are evaluated (e.g. machine translation). Basically, a source string is matched to an arbitrary target string meant to capture the source string's intended meaning - "arbitrary" because there may be an infinite number of strings that carry the same meaning. So what, exactly, are we measuring when we evaluate a model's ability to map between two of those infinite strings, chosen just because we like them best?
Language inference and comprehension benchmarks like the ones noted in the article are particularly egregious in this regard. They are basically classification tasks, where a mapping must be found between a passage and a multiple-choice spread of "correct" labels meant to represent its meaning. It's very hard to see how a model that does well on this sort of task is "incorporating world knowledge", let alone "common sense"!
Maybe NLP will have its ImageNet moment - but only in terms of benchmarks. Don't expect to see machines understanding language and holding reasonable conversations any time soon.