The Secretary Problem (wikipedia.org)
202 points by fofoz on Aug 19, 2022 | 120 comments



I found a home to purchase by implementing this.

I visited two places to establish criteria, then built a simple search across Greater London on Rightmove using area searches with proximity to Waitrose. I then filtered the results again by price per square foot and by rejecting certain keywords (aiming for £500-600 per sq ft with no need for renovation, and looking for no chain + quick sale).

As soon as a high-quality place with a low price per square foot came up, I put in an offer without having viewed it (I did view it once before completion).

I live there now... It's a gorgeous apartment purchased about 15% below market rate in a nice area.

Implementing a basic solution to the secretary / optimal stopping problem meant that with criteria that took out the vast majority of average places, I stopped on the first one and committed to it.


That doesn't implement the algorithm, nor does it fit the problem, because you were able to evaluate all options rather than one at a time (where you would need to make an irreversible decision on the spot).


It doesn't implement the algorithm, but not for the reasons you described. The fundamental issue this algorithm solves is whether to keep waiting for something better to pop up or to accept what you have now. The fundamental constraint is the maximum number of trials you are willing to do. So, if you are looking for a rental apartment, you might say that you will keep looking each day for N days. On day 12, when you find the best apartment you have seen so far, the question is whether you should commit to it or keep waiting for something better in the coming days.

One of the conclusions of the Secretary Problem is that you should not commit to anything for the first sqrt(N) trials, and then pick the first one that is the best you have seen so far. This maximizes the expected goodness of the apartment you select. If instead you want to select the best possible apartment, reject the first 37% (the 1/e rule) and then select the first one better than anything you have seen so far. This maximizes the probability that you select the very best out of N, but that probability is only about 37%. The problem with this approach is that there is a much higher chance that it won't beat the quality you'd get from the square-root approach.
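
If you want to poke at the difference yourself, here is a minimal Monte Carlo sketch (mine, not the commenter's). It assumes candidates arrive in uniformly random order and that you're stuck with the last candidate if nothing beats the observation phase:

  import math
  import random

  def run_trial(n, cutoff):
      # Candidates have ranks 1..n (1 = best), presented in random order.
      # Observe the first `cutoff`, then take the first candidate better than
      # everything seen so far; if none appears, we're stuck with the last one.
      ranks = list(range(1, n + 1))
      random.shuffle(ranks)
      best_seen = min(ranks[:cutoff]) if cutoff else float("inf")
      for r in ranks[cutoff:]:
          if r < best_seen:
              return r
      return ranks[-1]

  def compare(n=100, trials=100_000):
      for label, cutoff in [("1/e rule", round(n / math.e)),
                            ("sqrt(n) rule", round(math.sqrt(n)))]:
          results = [run_trial(n, cutoff) for _ in range(trials)]
          p_best = sum(r == 1 for r in results) / trials
          mean_rank = sum(results) / trials
          print(f"{label:12s} cutoff={cutoff:3d}  P(best)={p_best:.3f}  mean rank={mean_rank:.1f}")

  compare()

The trade-off described above should show up directly: the 1/e cutoff maximizes P(best), while the smaller cutoff tends to give a better mean rank.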


This isn't the way house hunting works in most markets today (well, at least up until a few months ago). You need to consider each property as it comes up and decide whether to make an offer or not.

Either you use broad filter criteria and look at a few houses before saying "this is a better option than what we've seen, let's make an offer", in which case you can ignore the filter for purposes of fitting the house search to the secretary problem.

Or like the parent post, you think you can express what you're looking for exactly through facts that appear in the ads, in which case you look at a few houses, then you buy the first house to appear that exceeds best of what you've seen.

What you do not have in most markets is the option to go back and buy a house you looked at two weeks ago. Houses that sit on the market aren't houses that are appealing to buyers who intend to live in them.


What happens if the best apartment you could have looked at was the first one, which you rejected because you hadn't seen enough by that point?

Following the above heuristic, isn't there a chance you never look at another apartment that would be "better than the others you've looked at" once you enter the decision window, and therefore keep looking until you reach N and have to take the last option?

(the implications for dating should be obvious here, with it being fairly common for young lovers to break up so they can "see what's out there", and not uncommon for people to reach an age where they feel the need to settle for "what they can get" at that point)


> Following the above heuristic, isn't there a chance you never look at another apartment that would be "better than the others you've looked at" once you enter the decision window, and therefore keep looking until you reach N and have to take the last option?

Yes, obviously. As Wikipedia says:

> The probability of selecting the best applicant in the classical secretary problem converges toward 1/e ~ 0.368.

The classic solution to the classic secretary problem only finds the single best option in about 37% of cases.

But in reality, you tend to know a bit about the distribution beforehand, and you are also interested in getting eg the second best secretary (if the best is no longer available), instead of just going for best-or-bust.


A hot housing market with a constant stream of options, where the good ones are barely on the market and require an instant, fully committed decision.

I'm unsure which part doesn't meet the criteria; I consciously used that approach to make the decision.


The key part of the Secretary Problem is that you know the number of applicants, and use this to determine how many candidates to examine (1/e of the total under the optimal policy) to determine the selection threshold.

Instead, it sounds like you arbitrarily decided to use 2 examples to pick a selection threshold for the "price per square foot" at which you'd accept a property.


1. You don't know how many candidates there are, so you couldn't apply the criteria.

2. You established a threshold of acceptability and took a candidate that met that threshold rather than saying, after X properties, I’ll take the next property that’s better than the best so far.

Your approach is much closer to the standard method of buying a house - you have constraints, needs, and wants, and buy a house that fits those.


One of the criteria is a finite, pre-known number of options. In this scenario, you're never forced to buy a house simply because it's the last one and if you don't buy it there's never going to be more houses available.


Set some timeframe you'd like to buy a house in, say x months. For small enough x, you can assume some constant rate y of houses coming onto the market per month; xy is your finite number of options.

The question then becomes: how much do the simplifications made affect the outcome of applying what we know about the secretaries problem to the real world situation? Someone could probably write a neat article about it.


This is obviously artificial because after x months, you are not going to buy something much worse than previous things you've rejected just because it's the best remaining option, rather than wait one more month. But having to do that is the crux of the secretary problem.

OP has not understood the significant differences between this clearly defined mathematical problem and his own experience; I'm not sure why people are indulging him.


It's perhaps important for people to understand that this type of problem is not necessarily about hiring secretaries, but rather a mathematical problem presented in a less abstract form that people can more easily relate to and digest. There could be dozens of variations: one could allow going back one decision point (changing your mind on the previous candidate), one could use parallelism and multiple evaluators, one could leave the total number unknown so you might have just seen the last secretary ever, etc.

Most real life problems will never perfectly match to this kind of math problem simply because real life usually offers more flexibility than a strictly defined mathematical scenario.

Reminds me of the joke about the mathematician and the engineer who can take as many steps as they want to reach the pot of gold in front of them, with the condition that every step covers at most half of the remaining distance. The mathematician is absolutely livid because he knows he'll never reach it, while the engineer is happy because he'll get there for all practical purposes.


Crucially, the optimal solution depends on the specific constraints. So you can’t just take a solution derived under one set of assumptions and use it in another situation without losing whatever guarantees it made in the original case.


The engineer is happy because they know to bend forward and reach with their arms.


You raise a good point about an assumption I think is being missed: the choice criteria have not changed between the time of interviewing the first applicant and selecting the n/e-th applicant. The housing market is too volatile for this assumption to hold.


Finite, yes... otherwise you wouldn't have a stopping problem, as you could preselect from all options. What I didn't know is how good each option was, or whether to wait for better or not.

n is finite as time provides the limit to n. We were looking to move within 6 months (this was a hard requirement based on personal circumstances) and I'd already established the number of likely options according to the rate they appeared on the market.


If fewer than 8 places would meet your criteria from among the current inventory plus what comes onto the market in the next 6 months, then you may have used the algorithm.

In all the parts of Greater London you considered, 7 or fewer possibilities seems quite low, but I don't know how tight your criteria were in practice relative to that market. If your criteria were that tight, I congratulate you on securing 1 of the 7 suitable properties.


I'd say that's true of almost every real-world example normally discussed. It is quite common for "buyers" to put multiple candidates on ice for as long as they can while they evaluate their options, be it in dating, hiring, or purchasing.


Think of the batch-search performed as „one secretary“. You either need to take what that search offered you, or you need to wait for the market to change enough in order to perform the next batch-search.


This is the same problem, but your algorithm was not n/e.

If there are 100 homes on the market, doesn't the optimal algorithm suggest

100/2.718 => 37 homes must be assessed

after which you commit hard to the first home that beats all of those first 37?

I mean: you can't argue with the results. You underpaid and are delighted with your home! But you share the problem space without having implemented anything close to the ideal solution. In this case your intuition and sophistication resoundingly /trounced/ the ideal solution in terms of efficiency.

So you won!


> If there are 100 homes on the market, doesn't the optimal algorithm suggest

It's not about the number of homes on the market, it's about how many homes you want to look at total (which is capped by the number of homes on the market). So, you say that you want to look at 10 homes max. You take 37% of either quantity as the evaluation period. After the evaluation period (3.7 houses) you commit to the first house that's better than the best you saw in the evaluation set.


People in this thread also seem to not be aware that the evaluation period can be measured in time as well (as long as the candidates come in uniformly at random, which, depending on the time of year, often doesn't happen with houses).


> with proximity to Waitrose

I guess this was a "sufficient affluence" indicator to filter out undesirable neighbourhoods?

(for readers who don't know: Waitrose is a British high-price-segment supermarket)


Does that actually work? Just because well-off people shop in Waitrose, does that also mean that houses around Waitrose don't include old council estates?


It would be somewhat reliable, although - depending on proximity - perhaps not that different from simply drawing a sector around south-west London: https://www.google.com/maps/search/waitrose/@51.505609,-0.33... .

The rub is that this area is not well-connected to the tube, at least for another 10-15 years.


There's a single Waitrose south of the river. Huh.

EDIT: Okay, 2. Missed the one in Greenwich. Still huh.


Most old council estates are privately owned now. Many newer ones are also being sold as shared ownership.


I can't speak for Waitrose specifically, but luxury brands will definitely do a comprehensive analysis of the surrounding area before opening a store. In the US at least, the equivalent is looking for the closest Starbucks or Apple store.


”Put the papaya down, Orlando!”


How did you find/calculate the floor area from Rightmove? In my experience listings _might_ have the area in the floor plan image, so you’d be hoping for floor plans then running OCR to find the area? But I’d expect that to often go wrong or not find anything (missing or duplicate floor plans etc). Would love to know as it’s infuriating that the UK market is based on bedroom count rather than floor area.


I live in Scotland. The seller has to provide a "home report" which contains this information. It's fantastic!



Bedroom count is ridiculous, it's an estate agent's marketing game whether a room's a study or another bedroom.

Should just be #kitchens, #bathrooms, #loos, #other IMO. Number of fitted/room-like cupboards etc. would be a bit useful too, but size so variable it would only give a very rough idea.


You didn't implement anything like this, but congrats on finding a home you're happy with using a different algorithm.

You only saw 2 homes -- an arbitrary number not at all tied to how many homes might be on the market. You then instantly chose the first candidate you saw above some arbitrary threshold. I hope you can see this is nothing like the optimal stopping problem outlined in the article.


I did similarly over a few months trying to choose a new apartment to rent. I learned the area and all the types of places that might be available and when one came up that was ideal I jumped on it even though it was in a very low season for apartment availability.


does rightmove allow filtering by price/sq foot?


It's a really cool result when you tell people about it, but actually when you think about it, it doesn't seem as useful as it sounds at first. The combination of some fundamental assumptions - you know the number of candidates in advance and you can't go back to previous candidates (*edit - should have added another one here: that you don't know anything about the distribution from which candidates are drawn before you begin) - don't really seem to match any of the purported applications, nor any I could think of quickly off the top of my head.

In reality, you're usually able to gauge some estimates of the moments of the population distribution before you have to make any decision (and decisions aren't always final, either). I.e., you'll be interviewing a couple of candidates before you send out any rejections, you'll see several potential mates before you narrow in on one, you know from experience what a nice apartment looks like when you visit one,...

There are also other problems in other potential applications: a gig economy worker needs to not only optimize the value of the next gig but also the time spent searching for the next gig, when buying a used car the biggest problem is the information asymmetry regarding the true value of the car, not that it's gone if you don't buy right away,...


The even more broken assumption is that the only success factor is to get the single best result. That's not how real life works either: even if the best one got away, there will still be many other options close to it in quality, and you'll be perfectly satisfied with any candidate in roughly the top five percent.


The traditional example of an application is dating to marry.

The number of 'applicants' isn't what matters; what matters is how many dates the subject can get in the amount of time they're willing to spend looking, and this can be estimated reasonably well.

The part where you can't go back, well, it's more true than not.

I say it's the traditional example of an application, but I rather doubt people choose their spouses this way, nor that they should.


My point is that the model assumes that you're starting with zero knowledge about the population when you start dating, that's why you need the burn-in phase where you reject everyone to get a good estimate. In reality, I'd think you'd have a pretty decent prior for the population parameters before you even start dating.


I'd disagree for the dating case; you know a lot about people, but you don't know what it's like to spend 6 waking hours with someone every day for a while until you try it, you don't know what the sex will be like, you don't know what being a whatever-in-law will be like. All things that have to be sampled over a period before you can build up a distribution.

Plus, for the dating especially, there's a lot of "je ne sais quoi" which requires sampling and can not be polled from the population at large.


You don't need to know number of candidates if you know the total amount of time (and assume candidates come in uniformly at random).

The most practical use case is selling a house.

There are also variations of the problem which model being able to go back to previously rejected candidates, as well as a probability that the candidate rejects you.


Right, selling a property might actually be a fairly good application of this model, I agree - you shouldn't leave a property listed for too long, and together with the feedback from your agent you should get a meaningful estimate of how many offers to expect in that period.

I'd be curious to see the agent's reaction when someone actually implements that and flat out rejects all the first offers.


It's a garbage premise entirely due to the fact that people - e.g. job applicants - can be so drastically different from one to the next (whether amazing or horrible). The secretary problem fails to bake that in, because it can't handle it inherently (you have to actually interview people to know that). It doesn't actually work in reality.

There's no probability theory that is going to function properly so as to be very useful in such a context, because the best candidate in the world can just as easily show up on the tenth interview as the first or third. Your first three job applicants can be horrible, and the next three can be amazing.

People that buy into the false nature of the secretary problem / optimal stopping theory (that it's actually useful in reality), are falling for the premise that eg several roulette spins in the past directly impact the odds of the present spin. It's a bad premise to utilize for exactly the same reason: the quality of your last candidate doesn't have any strict relationship to the quality of your next candidate.

Optimal stopping theory is only useful for highly indecisive people that can't figure out what they want, so they fall back to handing off their decision-making to something else.


None of what you said is consistent with the derivation of the policy. Note the policy is probabilistically optimal, not guaranteed to hire the best candidate every time. If you're hiring 1 person, sure you might end up SOL. But if you were hiring 100, this method would get you the best workforce.


This can be directly applied to wholesale auto auctions


Hmm, for auctions I think you really need to take the other agents into account as well (auction theory builds on game theory). The stopping-time model doesn't have strategic interactions.


I don't know what the theory says but giving your bidders this rule of thumb works better in practice. Or at least it did some years ago


What does the rule of thumb even say? How high should they bid when they see an attractive lot?

If your rule of thumb says that you should only start bidding after a third of the auctions are finished, I hope you see that there's a pretty obvious way for others to arbitrage against you.

Auction theory studies both optimal bids and optimal auction designs - imho it's one of the more useful areas of economics: https://en.wikipedia.org/wiki/Auction_theory


I get that it seems implausible that there's just money sitting around waiting to be picked up, but it was there. Used autos is a hugely fragmented industry that is only slowly consolidating. Bids + bid strategy are likely still extremely far from optimal. Probably more easy wins out there.


I believe you, sounds like you're actually doing some business in that area. You should really look into the theory then. Feel free to reach out if you need help.


Econtalk had a recent episode [0] that discusses this problem and the limitations when applied to real life. There's also a transcript [1] for those who prefer reading.

I found the discussion (and the entire episode) to be a nice reflection on how applying rules/logic/algorithms to big, deeply human problems doesn't always work. While the secretary problem is a fascinating and unintuitive mathematical result, as many other commenters have pointed out, it just doesn't have that many strict uses in people's real lives, outside of the general idea of "try a few of X to get the idea of the field, then pick one".

[0] https://www.econtalk.org/russ-roberts-and-mike-munger-on-wil...

[1] https://www.econtalk.org/russ-roberts-and-mike-munger-on-wil...


I'll sheepishly admit I'm so low on time that I can't listen to Econtalk as much as I'd like these days.

Knowing Munger, I'll assume they talked about the heuristic's application in automated systems and its perfect application there?

Also, dear HN reader, if you don't listen to EconTalk already, you're missing out on one of the best podcasts out there.


This was a great episode; I came here to post it but you beat me. I have ordered the book, which is published a bit later in the UK, and I'm looking forward to reading it.


Is this useful in real life?

You usually don’t actually need the best from an arbitrary sample, you need better than $threshold.

Aiming for best from sample when you have $threshold already could just result in opportunity cost.

Be it in hiring, dating, researching holiday destinations or anything.


This is useful when you don't have a good prior for where to set your threshold. If you have a firm threshold, yes, you're right that it isn't all that useful.

The form of the solution also suggests a "good-enough" variant, useful when speed or simplicity is more important than optimality: reject the first option you find that genuinely meets all your needs. Accept the next thing you find that beats that first option. It is kind of surprising how often having this sort of approach in your back pocket can come in handy.
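
A tiny sketch of that back-pocket variant, with made-up helper names, just to make the rule concrete:

  def good_enough_pick(options, meets_needs, better_than):
      # Pass over the first option that genuinely meets all your needs (it
      # becomes the benchmark), then accept the next option that beats it.
      # If nothing ever does, you end up with the last option seen.
      benchmark = None
      last = None
      for opt in options:
          last = opt
          if benchmark is None:
              if meets_needs(opt):
                  benchmark = opt
          elif better_than(opt, benchmark):
              return opt
      return last

  # Toy usage: scores stand in for how good each option is.
  scores = [3, 7, 5, 8, 6]
  print(good_enough_pick(scores, lambda s: s >= 6, lambda a, b: a > b))  # -> 8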


If you are threshold hunting, how do you set n? Or is n set based on resources? It would be interesting to see if there are n-setting heuristics. Maybe based on how fast the best is getting better in a small sample?


n depends on your resource budget. Your resources might be limited by availability (how many secretary applicants are there), time (how long does each test take vs how much time can you invest), etc.


But your budget might be flexible depending on what you get out of it. For example, you have cashed-up investors.


Not really. The problem as described has two issues that makes it unlike most real problems:

1. The problem only deals with optimising finding the best candidate, the "optimal" solution therefore has a high probability of picking a really bad candidate. In reality this failure mode is quite unacceptable.

2. While many real problems will have some opportunity loss by waiting, it is quite extraordinary that options are presented in such a linear fashion. In a real hiring situation you'd generally be able to go back to a candidate you interviewed earlier and hire them.


Regarding 2:

This may seem somewhat contrived, but I tried to come up with some tech analogy.

Let's say that you have some analyzer which takes in a finite amount of physical matter, and then produces some dataset. The analysis is a destructive process, so that the analyzer will only be able to produce N different datasets - until it runs out of the physical matter it is analyzing.

The datasets are huge, and will take up all available memory. Before the next dataset is finished, you need to decide whether to store or delete/discard the current. The permanent storage can only hold one of these datasets, and it is impossible to write over it. Once you have stored a dataset permanently, the next ones will get discarded as soon as they've been generated.

Or something like that.

Edit: Wouldn't surprise me if there's some embedded system / machine out there that's bound to select stuff like that.


It's even worse than (1), the optimal solution has a high probability of picking no candidate at all.


Some versions of the problem would have you just pick the last candidate then.


Not picking any candidate at all doesn't matter if you are just trying to optimize your probability of picking the best one.

Choosing one which you know not to be the best is not considered any better than choosing none.


> 1. The problem only deals with optimising finding the best candidate, the "optimal" solution therefore has a high probability of picking a really bad candidate.

What's "really bad" here and what is the probability of picking a really bad candidate?


For a large number of candidates it ends up being approximately 1/e (37%) chance of picking the best candidate, 1-2/e (26%) chance of picking another top candidate, and 1/e (37%) chance of picking a candidate of random skill. That last case occurs when the best candidate was among the first 37% of candidates, which are always rejected, then you run through the rest without finding a better one, and thus end up with the last one interviewed.

If you somehow still have to immediately hire or reject each candidate but care about the expected average rather than just finding the best candidate, the optimal solution will have you reject fewer candidates outright, then gradually lower the rank a candidate must reach to get hired as interviewing goes on, dropping it sharply towards the end of the candidate list.
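
Those three outcomes are easy to check by simulation; a rough sketch (assuming a reasonably large pool and uniformly random ordering):

  import math
  import random

  def outcome(n):
      # Classify one run of the 1/e rule: picked the best, picked some other
      # "best so far" candidate, or fell through to the last candidate.
      ranks = list(range(1, n + 1))  # 1 = best
      random.shuffle(ranks)
      cutoff = round(n / math.e)
      best_seen = min(ranks[:cutoff])
      for r in ranks[cutoff:]:
          if r < best_seen:
              return "best" if r == 1 else "other front-runner"
      return "stuck with last"

  trials = 50_000
  counts = {}
  for _ in range(trials):
      key = outcome(200)
      counts[key] = counts.get(key, 0) + 1
  for key, c in sorted(counts.items()):
      print(f"{key:18s} {c / trials:.3f}")
  # Expect roughly: best 0.37, other front-runner 0.26, stuck with last 0.37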


The example I first heard about this problem for was with portapotties. At what point do you stop looking through the line of open portapotties to find the cleanest one?

So I guess that's a use.


But that’s the best example of thresholds. You’re not going to ever find a perfectly clean one, you’ll obviously choose the first one that meets your standards.


This particular problem I would guess not, but the more general family of problems that is optimal stopping I would expect so.


It's useful if you don't know what $threshold is or if you want to get more "bang for your buck".

I believe there's a chapter in Freakonomics that deals with this problem and, if I'm not mistaken, talks about a situation where you might want to figure out some of the best places to eat in town without needing to visit all the places in town.


Applied to the problem of picking a toilet at a music festival: https://www.youtube.com/watch?v=ZWib5olGbQ0 .


I've seen thousands of toilets in my life, my policy to finding a toilet in a music festival would be to choose the first one that's "good enough", no need to choose the best one.

From my point of view, this policy is more relevant if you have no prior knowledge about the subject you are evaluating. Even when choosing secretaries, how often is it that you want to find the absolute best one?


I love the secretary problem. For me the value is in illustrating the idea that there can be a good stopping place even with uncertainty. It's a great antidote to the grass-is-always-greener fallacy that can keep one on the indecision train for far too long.

I don't use any of the math, but my simple version is to look at a reasonable* number of options and then pick whatever next one comes up that is reasonably* high in the distribution.

*A lot of gut feeling goes into this.


Entering my thirties still single, I think about this a lot. First read about it in Algorithms To Live By.


I heard a talk by Esther Perel (recommend her TED talks, numerous interviews and podcast appearances) where she spoke about the differences in mindset when choosing a partner between 30-40 years ago and now.

“Before”, one would choose (or even be strongly recommended) a partner, commit, and make it work.

Now (culturally), one expects to find the absolutely perfect person/flavor and have everything worked out at the time of selection.

You can see how such an attitude would drastically limit the choices.

Another difference is that nowadays partnership is expected to include that person being the perfect lover, perfect co-parent, best friend, partner in crime, sometimes therapist etc. while in the past the expectation that the romantic partner would also be someone to entertain us or always understand us or share hobbies etc. was much much lower.

obviously not absolutes but more of a cultural slider which limits choices and puts more stress on relationships.


> “Before”, one would choose (or even be strongly recommended) a partner, commit, and make it work.

Because divorce was not on the table, so you had to "make it work", even if in the end it was just a toxic relationship, with couples in their 60s or 70s who plainly hated their spouse.


>Because divorce was not on the table, so you had to "make it work"

This isn't 1950. Divorce (and even easier, separation) has been "on the table" for a long time. Yet the culture around dating and marriage today is miles away from decades ago.

You can tell yourself old couples hate each other, because of course some do. But people have been happily "making it work", across cultures, for centuries. It's about the expectations you have in a partner and what you want from the relationship. If that's made-for-TV-movie romance until eternity, good luck.


I'd like to see some paper or book supporting 'But people have been happily "making it work", across cultures, for centuries', proving that the majority of couples were happy in their golden years. But we should start by defining "golden years", because in many civilizations and ages lifespan was certainly shorter on average.

Edit: typo


I think one of the differences is in the definition of “happy” and the expectations around it.


And like in all things, there is probably a better middle ground. Don't commit forever to a horrible partner, but also don't look for the "perfect person" with whom there will be no difficulties. Neither is good. Look for the "good enough" partner and really try to make it work.


Absolutely. And don't be afraid to end a relationship that's not working, learn from it and be a wiser/better partner in your next one (if you are interested in having a relationship again).


Seems like marriage (and to some degree the employment setting) would complicate matters because both sides are looking for the optimum.

What happens if the secretary you decide to hire is just using you to calibrate their search for what a good boss looks like?

Feels like there would be some interesting maths in that.



Interestingly, the traditional algorithmic solution is inherently asymmetric: it gives better outcomes to one gender than the other.


Are you basing this on math or history?


This is a mathematical statement, see the end of the Wikipedia section "Algorithmic solution", where it states "Among all possible different stable matchings, it always yields the one that is best for all men among all stable matchings, and worst for all women."
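
For anyone who hasn't seen it, here is a minimal sketch (not from the thread) of the proposer-proposing Gale-Shapley algorithm that the Wikipedia section refers to; whichever side proposes gets its best achievable stable partner:

  def gale_shapley(proposer_prefs, reviewer_prefs):
      # Each prefs dict maps a person to the other side's names, most
      # preferred first. Returns {reviewer: proposer}; the matching is
      # proposer-optimal among all stable matchings.
      rank = {r: {p: i for i, p in enumerate(prefs)}
              for r, prefs in reviewer_prefs.items()}
      free = list(proposer_prefs)
      next_choice = {p: 0 for p in proposer_prefs}
      engaged = {}
      while free:
          p = free.pop()
          r = proposer_prefs[p][next_choice[p]]
          next_choice[p] += 1
          if r not in engaged:
              engaged[r] = p
          elif rank[r][p] < rank[r][engaged[r]]:  # r prefers the newcomer
              free.append(engaged[r])
              engaged[r] = p
          else:
              free.append(p)
      return engaged

  # Hypothetical toy instance with two stable matchings: whoever proposes
  # gets their first choice, the other side gets their second.
  men = {"m1": ["w1", "w2"], "m2": ["w2", "w1"]}
  women = {"w1": ["m2", "m1"], "w2": ["m1", "m2"]}
  print(gale_shapley(men, women))  # men propose:   w1->m1, w2->m2
  print(gale_shapley(women, men))  # women propose: m1->w2, m2->w1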


I've also thought about that. At first N seemed daunting, but then I added in a bit of the drake equation (https://en.wikipedia.org/wiki/Drake_equation) style thinking.

Humanity has about 33 bits of entropy, so 33 pieces of information about a person that don't depend on each other can very likely uniquely identify any person. Individually, a person is likely to meet 10,000 different people in their life; that's about 14 bits of entropy if you start there.

Male/Female, College educated/not, looks, substance use, desire for children, compatible hobbies, compatible diets, compatible cultures, compatible monetary beliefs, compatible sexual preferences, my own level of attractiveness. Each hard preference has a corresponding reduction in dating pool size.

Clearly I am also single.


This is why the old dating app approach was so powerful — you’d answer dozens if not hundreds of questions, and rank them in relative importance to you as well as the relative importance and allowable values of your partner’s answers.

I immediately threw out anyone below 85-90%. My dating pool in a dense urban area over a period of months was tens of people. Once filtered for attractiveness and mutual interest, I was down to single digits. I married the first 99% I went on a date with.

Unfortunately (for others) I don’t know if that method is still available anywhere but the app I was using got acquired and reduced to swiping left and right.


This made me chuckle; I too think like this, and yes, I'm single.


Seems like you haven’t found the right secretary yet.


  > The probability of selecting the best applicant in the classical secretary problem converges toward 0.368
There are also some pretty big tradeoffs with this strategy. There seems to be a roughly 1/3 chance that you already rejected the best candidate. That's all good if not coming up with a result is acceptable and practically free of cost.

If the stakes are higher though, more conservative strategies might be in order. For example, you might want to sample a much smaller group than n/e first and make a decision based on their median, or even look at their standard deviation to get a reasonable sense of the value distribution - as in the real world, values are seldom distributed uniformly.

If there is a cost to coming up empty, you may also want to employ a hybrid solution, such as going with "1/e-law of best choice" until you hit the 70% mark and then accept the first candidate that is higher than a linear interpolation of the highest value and zero, lerped across the remaining percentage of samples.
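
One possible reading of that hybrid, sketched out (the 70% mark and the linear bar are as described above; everything else is my own filling-in):

  import math
  import random

  def hybrid_select(values, fallback_frac=0.7):
      # Phase 1: observe the first n/e candidates and record the best value.
      # Phase 2: accept the first candidate beating that record.
      # Phase 3: past fallback_frac of the list, also accept anything above a
      #          bar that decays linearly from the record down to 0 at the end,
      #          so we never walk away empty-handed.
      n = len(values)
      cutoff = max(1, round(n / math.e))
      fallback_start = int(fallback_frac * n)
      best_seen = max(values[:cutoff])
      for i in range(cutoff, n):
          v = values[i]
          if v > best_seen:
              return v
          if i >= fallback_start:
              progress = (i - fallback_start) / max(1, n - 1 - fallback_start)
              bar = best_seen * (1.0 - progress)  # lerp: record -> 0
              if v >= bar:
                  return v
      return values[-1]  # safety net; rarely reached for non-negative values

  values = [random.random() for _ in range(200)]
  print(hybrid_select(values), max(values))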


You know what I can't accept about statistical probability? While it does indeed accurately measure past events, to accept it you also have to accept that outcomes are not random, but that somehow arranging causal events a certain way, as in the solution to this problem, will result in a higher likelihood of desired results.

The only thing that this tells me is there is an underlying causal relationship. Otherwise, what is the alternative? That all events in reality are a complete throw of the dice and probability just makes sense of how things were pre-determined?

Another issue I have is (taking the problem at hand for example) that the applicants are ordered randomly, right? That discards a key piece of information in their ordering (the first applicants want it most, or they are undesirable, for example).

And when you randomize, that simply means you discarded the information about your new reordering. In reality, your arbitrary shuffling has a cause. If I asked you to pick a number at random, there is a reason why you arrived at that number and that reason or cause also has cause and so on, so nothing that is a product of cause can be truly random.

Last point for this post: what does the pool of candidates look like? Keeping it simple, 3 people apply. Since you shuffled them, the desperate first applicant ends up in the middle, the mediocre last applicant stays last, and the talented person who applied a bit late goes first. So you get the bad or the mediocre one. Since you get to shuffle only once, this ordering really has two groups: the desirable top-talent person and the undesirable mediocre or bad hires. So if your shuffling is truly random, it is similar to a coin toss where the probability is 50%. My point is that in reality, not just in hiring but in many other things, the vitality curve applies: you really only want the top 10-20%. It might work if your goal is "the best of the mediocre middle", but if your goal is the binary "top 10%" vs "bad hire", the implication is that your sampling of the first person is the equivalent of a coin toss, except the coin toss here was really your one-chance shuffling of applicants. Does this make sense? The solution would make sense to me if you kept reshuffling their order after each interview.


You may want to check out https://en.wikipedia.org/wiki/Causal_inference - the issues that you point out are all well understood and being addressed as well as possible within the limitations of our current knowledge of statistics, the logical fundamentals and the available data.

Ignoring ordering for example isn't a problem with stochastic modelling per se, but just with this specific model when applied in a specific context - you could of course use a different model to account for ordering as well. Similar for whether you want to optimize the expected average outcome or some other statistic like a given percentile. Those models exist already, don't blame the discipline for your lack of knowledge.


Thank you for the link. I suppose my whole point was that one should not take the solution at face value when applying it to a real-world scenario. Applying context-aware critical thinking is mandatory to avoid applying the wrong solution to a problem.


Regarding the first two paragraphs.

If you came to these conclusions on your own, I applaud your critical thinking. The next steps however would be to search the literature, realize these aren't new ideas but instead have been discussed and solved, and learn them.


Thanks, I mean I did not think I made some new discovery without even the smallest education on the subject lol. The point I was making was how these considerations make me doubt the solution as optimal.

I am sure technically they are but in a real hiring process if talent supply is low, you should probably go with the first guy that meets or exceeds your expectations.


I had considered, but not tried, using this algorithm for a dark-seeking robot (perhaps in the pedigree of the Braitenberg or BEAM style robots).

Where the darkest place in a room is seems unknowable to a simple robot.

So take a random walk for perhaps 3 minutes or so while recording the darkest light level. Then wander for another 7 minutes stopping if you record a lower light level during the initial wander.

Within 10 minutes I would like to think your robot will have done better than average at finding the darkest place in a room.
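
A rough sketch of what that could look like in code; the sensor and motor hooks (read_light_level, step_randomly, stop) are hypothetical stand-ins, and the 3-minute/7-minute split follows the comment above:

  import random
  import time

  def read_light_level() -> float:
      return random.random()  # stub: replace with the real light sensor read

  def step_randomly() -> None:
      pass  # stub: drive the motors for a short random move

  def stop() -> None:
      pass  # stub: halt the motors

  def seek_dark(observe_s=180, commit_s=420):
      # Phase 1: wander randomly, only recording the darkest level seen.
      # Phase 2: keep wandering and stop at the first spot darker than
      #          anything from phase 1 (or give up when time runs out).
      darkest = float("inf")
      start = time.monotonic()
      while time.monotonic() - start < observe_s:
          step_randomly()
          darkest = min(darkest, read_light_level())
      while time.monotonic() - start < observe_s + commit_s:
          step_randomly()
          if read_light_level() < darkest:
              stop()  # commit to the first spot beating the observation phase
              return
      stop()  # time's up: settle for wherever we ended up

  seek_dark(observe_s=3, commit_s=7)  # short run for testing the stubs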


Isn't this a typical maze problem for a robot, but with the cost function optimized to reward darker spots? If that's the case, wouldn't an A* hybrid or particle filters solve your problem?


Maybe. I'll have to look into the algorithms you mentioned. To me there was a Secretary Problem use-case because we don't know what the light amplitude range is - we don't know what "dark" is (and will vary at different times of the day, vary by room, etc.). Also it's simple enough that you can imagine even a cellular-size organism implementing something like it.


Can I assume all things remain constant so, for example, the quality of applicants doesn't improve over time as they finish their year at the local college?


Ha ha, you might be overthinking it.


Nah, it's just my way of saying this is a fancy way of taking a guess


Beautifully told in https://algorithmstoliveby.com


That article doesn't seem to explain why it is called the Secretary problem. Was it just because someone chose to illustrate it with a hiring-a-secretary example, or was there some peculiar reason this is or was the practice when it came to hiring a secretary?


The answer appears to be "sexism". The problem has a number of names, all of which seem to revolve around men choosing women. The original name is "fiancée problem".

(Secretaries were historically male, but by the 1940s when the problem was devised, it had become a largely female profession. It became less about being a trusted adviser, and more about operating a typewriter.)

I suppose it makes a vague kind of sense that you don't get to call back a potential "applicant" for marriage. You get to evaluate one at a time, and if you reject them, you don't get a second chance.

I don't know why that shifted to "secretary". Maybe it seemed marginally less sexist?


I've always heard it called the marriage problem or the secretary problem, both of which are gender neutral.


When you review candidates for a position you cannot know if a better one has yet to apply nor (it is assumed) can you go back to a previous candidate (the assumption I suppose is that they now have a job).

As the article points out, it is also known as the Marriage Problem for similar reasons. The name "Googol Game" is where I first heard about it (and I believe it was Martin Gardner that presented it as such).


Maybe because secretaries are "rankable" (a term used in the article). Perhaps by words-per-minute typing speed.


1-1/e is one of my favorite numbers. Its complement 1/e is the fraction of candidates rejected outright under the Secretary Problem's optimal policy, and 1-1/e is also the probability of a one-in-N chance event occurring in at least one of N trials.
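
The second fact is just the limit 1 - (1 - 1/N)^N -> 1 - 1/e; a quick numerical check:

  import math

  for n in (10, 100, 10_000):
      print(n, 1 - (1 - 1 / n) ** n)
  print("1 - 1/e =", 1 - 1 / math.e)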


Interesting that we have this Turkish proverb: Allah'ın hakkı üçtür - Translation: God gives (ordains) three chances.

Almost matches the optimal chance for success rounded down.


The secretary problem (max odds of best) isn't the only problem in that space.

It's often good to go for a "within 10% of best" solution.


It seems to me that this would be a fair strategy (the immediate rejection or acceptance part, that is) if N is very large.

But for smaller and more manageable N, I would assume you're better off just ranking them all and working out some decision threshold afterwards - when you have all the information at hand.


From the article:

“If the decision can be deferred to the end, this can be solved by the simple maximum selection algorithm of tracking the running maximum (and who achieved it), and selecting the overall maximum at the end. The difficulty is that the decision must be made immediately.”


That assumes you can compare all N against each other and then select, which this problem assumes you cannot.


Hmm. So if you keep waiting for a better option, the chance of choosing the best one decreases after you've rejected 3.


No, that's not how it works. There is a chance that one of the first 3 candidates is the best one, but as the total number of candidates increases the probability of that being the case decreases.

A different failure mode for this algorithm is when the candidates are presented in "worst to best" order - it makes you set your threshold low and then settle for an average candidate.


You're reading the graph wrong - the X axis is the number of available options. More options = less likely to pick the top one


Yeah that’s exactly what I’m saying.


Only if the « you keep waiting for a better option » and « have rejected three » description applied as well to the case when you don’t reject anyone and stick with the first candidate.



