It's all subjective, but as a data analyst I'm excited about probabilistic databases. Short version: load your sample data sets, provide some priors, and then query the population as if you had no missing data.
The most developed implementation is BayesDB [1], but a lot of ideas are coming out of a number of places right now.
It sounds like in many applications of machine learning (I'm thinking mainly of the swathes that name-drop it on a landing page, and probably usually mean 'linear regression') this could handle the brunt of the work.
e.g. store customer orders in the DB, and query `P(c buys x | c bought y)` to make recommendations, where `c buys x` is unknown, `c bought y` occurred, and we know about other customers' x and y.
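To make the query above concrete, here is a minimal sketch of what that conditional probability means, estimated empirically from plain order data rather than via a probabilistic DB. The customers, products, and orders are invented for illustration:

```python
# Hypothetical sketch: estimating P(c buys x | c bought y) empirically from
# order data. A probabilistic database would let you ask this directly,
# smoothing over sparse data with its priors; here we just count.
from collections import defaultdict

orders = [
    ("alice", "y"), ("alice", "x"),
    ("bob",   "y"),
    ("carol", "y"), ("carol", "x"),
    ("dave",  "z"),
]

# Group purchased products by customer.
baskets = defaultdict(set)
for customer, product in orders:
    baskets[customer].add(product)

# P(buys x | bought y) ~= (# customers with both x and y) / (# customers with y)
bought_y = [b for b in baskets.values() if "y" in b]
p_x_given_y = sum("x" in b for b in bought_y) / len(bought_y)
print(p_x_given_y)  # 2 of the 3 customers who bought y also bought x
```

The point of a probabilistic DB is that you'd express this as a query over the model rather than writing the counting logic yourself, and it would still give a sensible answer when the joint observation is rare or missing.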
The way I see it, the real utility comes from the fact that domain models, such as those in a company's data warehouse, are typically very complex, and a great deal of care often goes into mapping out that complexity via relational modelling. It's not just that c bought x and y, but also that c has an age and a gender, last bought z 50 days ago, lives in Denver, and so on.
Having easy access to the probability distributions associated with those relational models gives you a lot of leverage for solving real-life problems.
Would you be so kind as to provide several introductory articles on probabilistic matching of data? Fuzzy searching, most-probable matches, things like that?
The agent modelling I'm aware of is in simulation. I have a feeling there would be a lot of interesting duality between the fields of agent-based simulation and Monte Carlo based probabilistic modelling, but I don't know enough about the former to say offhand.
ABM is an MC method: individual agents randomize their behavior by sampling from distributions over the possible courses of action defined by their agent type.
[1] http://probcomp.csail.mit.edu/bayesdb/