It's all subjective, but as a data analyst I'm excited about probabilistic databases. Short version: load your sample data sets, provide some priors, and then query the population as if you had no missing data.
The most developed implementation is BayesDB [1], but a lot of ideas are coming out of a number of places right now.
It sounds like in many applications of machine learning (I'm thinking mainly of the swathes that name-drop it on a landing page, and probably usually mean 'linear regression') this could handle the brunt of the work.
e.g. store customer orders in the DB, and query `P(c buys x | c bought y)` to make recommendations, where `c buys x` is unknown, `c bought y` occurred, and we know about other customers' x and y.
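To make the query above concrete, here is a minimal sketch of what that conditional probability means, estimated empirically from plain order data rather than via a probabilistic DB. The customers, products, and orders are invented for illustration:

```python
# Hypothetical sketch: estimating P(c buys x | c bought y) empirically from
# order data. A probabilistic database would let you ask this directly,
# smoothing over sparse data with its priors; here we just count.
from collections import defaultdict

orders = [
    ("alice", "y"), ("alice", "x"),
    ("bob",   "y"),
    ("carol", "y"), ("carol", "x"),
    ("dave",  "z"),
]

# Group purchased products by customer.
baskets = defaultdict(set)
for customer, product in orders:
    baskets[customer].add(product)

# P(buys x | bought y) ~= (# customers with both x and y) / (# customers with y)
bought_y = [b for b in baskets.values() if "y" in b]
p_x_given_y = sum("x" in b for b in bought_y) / len(bought_y)
print(p_x_given_y)  # 2 of the 3 customers who bought y also bought x
```

The point of a probabilistic DB is that you'd express this as a query over the model rather than writing the counting logic yourself, and it would still give a sensible answer when the joint observation is rare or missing.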
The way I see it, the real utility comes from the fact that domain models, such as those in a company's data warehouse, are typically very complex, and a great deal of care often goes into mapping out that complexity via relational modelling. It's not just that c bought x and y, but also that c has an age and a gender, last bought z 50 days ago, lives in Denver, and so on.
Having easy access to the probability distributions associated with those relational models gives you a lot of leverage for solving real-life problems.
Would you be so kind as to provide several introductory articles on probabilistic matching of data? Fuzzy searching, most-probable matches, things like that?
The agent modelling I'm aware of is in simulation. I have a feeling there would be a lot of interesting duality between the fields of agent-based simulation and Monte Carlo based probabilistic modelling, but I don't know enough about the former to say offhand.
ABM is an MC method: individual agents randomize their behavior by sampling from distributions over the possible courses of action defined by their agent type.
[1] http://probcomp.csail.mit.edu/bayesdb/