This is great work, but I think the machine learning part is thrown in mostly for trendiness. More rigorous statistical analysis should lead to a cleaner algorithm that doesn't rely on pretrained classifiers.
I'm a little more bothered by the glibness of moving from 'we have a good prediction/regression algorithm' to 'we can causally control and optimize connections using it'. When they claim in the abstract that
> Based on the measurement analysis, we develop a machine learning based AP selection strategy that can significantly improve WiFi connection set-up performance, against the conventional strategy purely based on signal strength, by reducing the connection set-up failures from 33% to 3.6% and reducing 80% time costs of the connection set-up processes by more than 10 times.
They don't actually demonstrate this, and they can't know that it improves WiFi connection performance, because they never benchmark it on any real-world devices. It's pure extrapolation from the algorithm's predictive performance on the dataset. (And there are some suspicious inputs to the algorithm, like time of day.) When they write on pg9 about where they are pulling these numbers from:
> To evaluate, we first divide our connection log dataset into two parts, each subset contains 50% of the overall data. ...This fresh dataset ensures that we can accurately evaluate the performance of our algorithm if we deploy our algorithm in the wild, where many of the APs will not be seen by the mobile devices before.
They're just wrong. Splitting like this only ensures good out-of-sample performance when you keep drawing from the same distribution, but once you use the algorithm to make choices, the distribution changes. Correlation != causation; it's no more guaranteed to help than data-mining hospital records, finding that antibiotics are apparently killing patients, and concluding that hospitals should stop using them.
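To make that point concrete, here's a toy simulation of the antibiotics analogy (all numbers invented; nothing here comes from the paper): a hidden severity variable drives both who gets treated and who dies, so the correlation you mine from the observational records points the opposite way from what actually happens when you intervene.

```python
# Toy confounding simulation: a pattern mined from observational data
# generalises fine to a held-out split drawn the same way, yet the
# causal effect of intervening is the opposite. All numbers are made up.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

def simulate(policy):
    """policy(severity) -> treatment assignment (1 = antibiotics)."""
    severity = rng.uniform(0, 1, n)          # hidden confounder
    treated = policy(severity)
    # Antibiotics genuinely help: they reduce the death risk.
    p_death = np.where(treated == 1, 0.4 * severity, 0.6 * severity)
    died = rng.uniform(0, 1, n) < p_death
    return treated, died

# Observational regime: doctors mostly treat the sicker patients.
obs_treated, obs_died = simulate(lambda s: (rng.uniform(0, 1, n) < s).astype(int))

# "Data mining the hospital records": death rate by treatment status.
for t in (0, 1):
    print(f"observational death rate, treated={t}: "
          f"{obs_died[obs_treated == t].mean():.3f}")
# Treated patients die more often, so antibiotics look harmful.

# Interventional regime: assign treatment independently of severity.
for label, policy in [("nobody treated", lambda s: np.zeros(n, dtype=int)),
                      ("everyone treated", lambda s: np.ones(n, dtype=int))]:
    _, died = simulate(policy)
    print(f"intervention '{label}': death rate {died.mean():.3f}")
# Treating everyone lowers the death rate: the treatment actually helps.
```

The same trap applies to the AP-selection algorithm: a 50/50 split of the logs validates prediction under the logged policy, not the outcome of letting the predictor choose the AP.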
In general, any opinions about where we are on the hype curve for machine learning (as a stand-alone concept separate from traditional statistical inference)?
It's not the first hype cycle, so we've already been through all the phases before, but I'd guess we're still rising towards peak hype. The difference is that this time AI has real-world applications and benefits, so it's not just a bunch of hot air. It won't crash from this level of hype yet, because we now have lots of results that were unimaginable even 5 years ago.
We're passing human-level accuracy in vision, speech, text and behavior - on specific tasks, not in general yet. In the last 3 years neural nets have become creative - they can now create paintings (neural style transfer), images (GANs), sounds (WaveNet), text (seq2seq, translation, image captioning) and gameplay (Atari, AlphaGo).
All of these are complex forms of creativity, not simple classifiers. So the progress is recent; we haven't yet sat through a stretch of disappointing results. That's why we're still on the rise.
Well said. I'd agree that we're just to the left of the peak, though that applies mostly to the areas adjacent to the problems where deep learning has already shown real value.
I remember the previous AI peaks in the 80s and 90s, as well as the neural-net and fuzzy-logic euphorias. The problem back then was that the results were not that exciting. Now we have some really impressive applications searching vast datasets and recognizing useful features. However, I notice everyone is trying to apply the same algorithms to problems that aren't well suited to that type of approach.
If the goal is only to improve performance, a black-box machine-learning model should outperform a hand-tuned algorithm. The generalisation (out-of-sample) performance will depend on how representative the training and evaluation sets are, but that is equally true for an algorithm with human-selected features.
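As a rough, hypothetical sketch of that comparison (invented features and success model, not the paper's data or method): a hand-tuned "strongest signal" rule and a black-box logistic-regression classifier scored on the same held-out split. Both scores carry the same caveat about how representative that split is.

```python
# Minimal sketch: hand-tuned heuristic vs. learned classifier on one
# held-out split. Features, thresholds and the success model are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical per-scan features: signal strength (RSSI, dBm) and AP load.
rssi = rng.uniform(-90, -30, n)
load = rng.uniform(0, 1, n)
# Hypothetical ground truth: connection success depends on both, not RSSI alone.
p_success = 1 / (1 + np.exp(-(0.08 * (rssi + 60) - 3.0 * (load - 0.5))))
success = rng.uniform(0, 1, n) < p_success

X = np.column_stack([rssi, load])
X_tr, X_te, y_tr, y_te = train_test_split(X, success, test_size=0.5, random_state=0)

# Hand-tuned rule: predict success whenever the signal is "strong enough".
heuristic = X_te[:, 0] > -65

# Black-box model fit on the training half.
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
learned = model.predict(X_te)

print("heuristic accuracy:", (heuristic == y_te).mean())
print("learned   accuracy:", (learned == y_te).mean())
# Either number is only as trustworthy as the held-out split is
# representative of the conditions the device will actually see.
```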