Minecraft diamond challenge leaves AI creators stumped

TheRealPomax · on Dec 14, 2019

It's a weird thing to somehow think that an AI with only 4 days of training data should somehow outperform even a human child, which has a decade or more of imitation learning experience. A human doesn't come to minecraft "knowing nothing", like the machine learning programs in this challenge. They already know basically everything there is to know in order to perform task-oriented goal planning and only need to focus on task discovery.

If anyone had pass the bar this challenge set, they'd have basically solved the biggest hurdle in general AI, and would probably have received a call from John Carmack.

YeGoblynQueenne · on Dec 14, 2019

That's kind of the point. The trend in machine learning right now is to try and solve problems "from scratch" by training "end-to-end", which means without any prior knowledge. For example, DeepMind's last three or so iterations of their Alpha-X architecture for game playing did not even have any prior knowledge of the rules of the games they tackled (chess, shoggi, go and Atari games) and were trained entirely through self-play and without anything like an opening book of moves etc, like the origional AlphaGo architecture (or DeepBlue and traditional chess engines).

That is very big in AI right now, to try and remove the need for the "human in the loop" and learn complex tasks from 0 background knowledge.

Obviously, like you point out, this is very difficult to do and it requires lots and lots of data and compute (no, more than that. Really lots). As the article reports, this eliminates any hopes of "democratising" machine learning and makes advancing the state of the art a game that only large corporations can play with any hope of winning.

semi-extrinsic · on Dec 14, 2019

> The trend in machine learning right now is to try and solve problems "from scratch" by training "end-to-end", which means without any prior knowledge.

In a strong sense, this implies ML is not "intelligent" in any conventional sense of the word. What little we have of understanding of learning and intelligence in the scope of humans strongly implies that drawing parallels with solutions to other tasks, building on previous knowledge, is the key to intelligence.

Perhaps the purest expression of this is in pedagogy of math. Piaget and Brissiaud are some of the scholars who I think really grokked this. They brought forward the concept of compression as essential for learning. E.g. if you have learned but not yet compressed the operation of addition, you will have a very hard time learning multiplication. After addition has been compressed such that you don't have to spend any effort doing it, you are ready to learn multiplication.

Maybe it would be worthwile to explore if there are AI techniques that can learn simple tasks first, then use that knowledge to solve harder tasks?

gleenn · on Dec 14, 2019

Tranfer learning is a form of this, where you train the first half of a neural net over a huge set of data, then take a much more specific task and train just a bit more by adding some layers, and you get great sometimes industry leading results. Look up Google's BERT project, trained generically on English texts and then you can reuse that part in many different tasks like sentiment analysis or entity recognition or translation. An amazing way to leverage Googles compute and data access and do cool things even as a single developer.

bitL · on Dec 15, 2019

> as a single developer

I think you are out of luck already. Even with transfer learning, retraining BERT to fit a larger dataset takes ages even on an RTX 8000. A single state-of-art experiment/training run at Facebook now costs 7 figures. The age of individual SOTA ML researcher/developer is over.

raducu · on Dec 15, 2019

This is so sad, actually, and It's not just ML, it is every kind of new tech and discovery.

To me it shows we have already picked the easy fruits and technology will not continue to evolve exponentially.

2muchcoffeeman · on Dec 15, 2019

> In a strong sense, this implies ML is not "intelligent" in any conventional sense of the word.

I don’t know why people have so much faith in ML and AI. Considering what a CPU actually does, shouldn’t the default position be one of skepticism?

It’d be truly amazing if you can make a machine intelligent in the human sense by merely coming up with a clever algorithm.

DavidSJ · on Dec 15, 2019

Or we could flip your question around:

Considering what a neuron actually does, shouldn't the default position be that the brain is a computer?

macintux · on Dec 15, 2019

How certain are we of what neurons and other building blocks of the brain actually do?

friendlybus · on Dec 15, 2019

Every advanced tech has been used as a metaphor for how the mind works. People used to think the brain worked like a piston engine. Now people think the brain is like a computer. Some people also think we are a network ala the internet of biological persons. Lots of artists think in images. Some people see the brain as an evolving story. The people who have any idea how the brain works are those that have been studying it for decades, I would be interested in those metaphors.

2muchcoffeeman · on Dec 15, 2019

No, because you’re being reductive. Biological organisms are clearly not like silicon computers.

Maybe biologist come up with the next big break through. I mean, if you want to match an organically grown organism ...

warent · on Dec 15, 2019

Is it clear that they're not alike? In the end it's all information and data being processed, whether by silicon or neural cell. What difference does the medium make to the end result?

lukastr0 · on Dec 15, 2019

Then at the very least, you should be able to simulate a biological brain, neuron by neuron, and get an intelligent computer that way. If you agree that's possible, you also have to consider whether it's the most efficient approach -- plausibly special algorithms could be invented that get the same intelligence at a fraction of the computing power.

friendlybus · on Dec 15, 2019

The conversation around artificial intelligence is more of a call to action for human intelligence to figure out what we need to do and how wield ML than it is a description of the fundamentals of the computing. Even if the results are mediocre in the extreme, you can bunch together a huge amount of social and business change under that banner. I'd argue that's why people have faith in AI not because of what a computer can do, but because it is a call to faith in humans ourselves to integrate the technology and all implicated changes to it's maximum potential.

garmaine · on Dec 15, 2019

They do build on previous knowledge. That’s literally what the training is doing.

It’s just easier, engineering wise, to start over from scratch on each iteration. And it makes it easier to evaluate how each change improves the outcome or not.

icebraining · on Dec 14, 2019

But they do build on previous knowledge (that's the training part), it's just the such knowledge is built from scratch, rather than taught by others.

joe_the_user · on Dec 14, 2019

A human can learn something through a fairly small amount of specific knowledge via written, verbal, etc instruction.

All the data and the training process together are very clunky in comparison.

IE, an inability to be taught by others is a serious limitation to say the least.

ben_w · on Dec 14, 2019

Perhaps we can learn from small data, perhaps that’s an illusion. How much of our mind is hard-wired by our genes, our genes being a very slow-learning form of intelligence?

Or, when we see “one” example of a new animal (for example), do we see a single picture? Or do we see it moving for about a second? Effectively multiple images? And, having previously learned a general model for physics, stereoscopic vision, and lighting, does that one second of motion gives us the size, shape, and a guess at the musculature of the animal?

If you strip those advantages, how many training examples do we need? The examples of people blind from birth or from extreme youth who regain vision in adulthood[0] implies that we learn a lot in our youth about how to see, which we can share between recognition tasks we learn later. Keeping those basic skills — avoiding “catastrophic forgetting” — still seems to be a rare and advanced feature in AI, but it’s not impossible.

There is still a lot we don’t know about how our own minds work. We might be closer than we realise, or much further.

[0] https://en.m.wikipedia.org/wiki/Mike_May_(skier) amongst others. I never did find the name of the guy who learned to see from touching a monkey statue…

joe_the_user · on Dec 15, 2019

It's certainly worth keeping in mind the limitations that you describe and human's learning activity is certainly not as the rationalist intuition. But to take these limitation as entirely characterizing human activity seems bend the stick too far the other way and fly in the face of the many ways humans have modified their environment and adapted to those modifications and fairly often learned with only a few examples. And sure, plausibly a fair portion of those quick adaptation involve reworking basic facilities like vision. The whole thing is consistent with the neocortex sitting on top of the rest of brain and serving as a means to rewire it.

topmonk · on Dec 15, 2019

> A human can learn something through a fairly small amount of specific knowledge via written, verbal, etc instruction.

This demands the mastery of language which is a very high bar. Most animals haven't achieved this, so I think if you expect to get a computer to learn this way before being able to reason as well before you consider it having intelligence, you would also have consider most animals not having intelligence as well.

josefx · on Dec 15, 2019

> This demands the mastery of language which is a very high bar. Most animals haven't achieved this

Then avoid language as medium. Many animals can learn from being shown an action once and apply it to themselves.

semi-extrinsic · on Dec 14, 2019

But it's not transferable. If you teach the AI to play checkers, you don't use that knowledge when trying to teach it chess.

imtringued · on Dec 14, 2019

>The trend in machine learning right now is to try and solve problems "from scratch" by training "end-to-end", which means without any prior knowledge.

That's not a very useful goal. I can trivially make a game that is impossible for such an AI to win. It would require you to enter a 1024 bit code [0] and if you are correct you win. For a speed runner the time it takes to beat the game is merely limited by their typing speed because the code is always the same but an AI with no prior knowledge can only brute force the solution. How does the speed runner win in the first place? They ask the game developer and then post the code on a wiki.

[0] an euphemism for minecrafts crafting system

YeGoblynQueenne · on Dec 14, 2019

The idea is that such end-to-end techniques can be used to tackle problems for which we don't have background knowledge. For example, (according to DeepMind's latest paper on MuZero) real-world robotics tasks etc. Atari games are a good use-case for this because they dont' have very clearly defined rules like board games have, so a model basically has to first learn to play the game before it can learn to play it well.

You can perhaps design a game that is difficult to solve by AI (though I wouldn't bet on it). The question with AI is usually not what it can't do but what it _can_ do. If a certain system fails at your task but performs well in a range of other tasks that are considered useful and challenging, then it's worth consideration. Especially so if there is no system that can perform well on that one task.

Jasper_ · on Dec 14, 2019

What if you had 22 hours of footage showing speedrunners entering the 1024 bit code to train on?

chmod775 · on Dec 14, 2019

It would be possible to do that, but it's not really what was asked.

The way you'd train a neural network with that footage is comparing the recording of the neural network playing the game to the recording of the speedrunner, and scoring it based on the difference in the footage.

However it wouldn't really be learning "from scratch" and "end-to-end" anymore, since you're essentially teaching it to imitate and play like a human player. Also your network would likely be just blindly executing a series of learned actions, without any real understanding of the game. If you confronted it from some situation that's not part of some speedrun it memorized, it would probably be at a loss. Pretty useless.

pishpash · on Dec 14, 2019

How about it watches how the human obtains the secret number (1024) in this game, then next time extrapolates the process to a different game with a different secret number? That is the kind of learning that humans do and AI's can conceivably do the same.

chmod775 · on Dec 14, 2019

That's just not how neural networks learn. They generally can't "watch" others to learn.

In the example above, at no point the neural network would be watching the video footage. The reference footage is just used for (automatically) scoring how well it does during training. It's not fed into the network itself in any shape or form.

ramblerman · on Dec 16, 2019

You are confusing the goal and the method.

Setting up a stupid method of getting to the goal doesn't disqualify said goal.

raverbashing · on Dec 14, 2019

But there is a lot of "common knowledge" that's useful

All the problems solved by fine-tuning image net for example, instead of training from zero

Identifying objects in a 3D environment is useful across several domains and it shouldn't be limited to Minecraft

codegladiator · on Dec 14, 2019

> with only 4 days of training data

so 4 days to train their AI and not 4 days of training data.. on the data which was "60 million frames of recorded human player data"

TheRealPomax · on Dec 15, 2019

My bad: they got 2 weeks of footage. Other than a clerical error, of course, this changes nothing about the fact that that's nowhere near enough data unless you've already cracked general AI.

allovernow · on Dec 14, 2019

>It's a weird thing to somehow think that an AI with only 4 days of training data should somehow outperform even a human child, which has a decade or more of imitation learning experience.

Well...not so much if you consider that machines are orders of magnitude faster than humans in any other task, including processing and recording data, a.k.a. learning.

TheRealPomax · on Dec 15, 2019

Not really, no. Procesing and recording data is not "learning". Machines are incredibly much faster at _classifying_ but that's been our problem for decades now: we figured out how to make insanely efficient and unreasonably good classifiers. However, we haven't been really able to turn that into efficient "learning", the best we've been able to do is what Deep Mind does, which requires very power machines working with literally decades worth of data.

ars · on Dec 15, 2019

> with only 4 days of training data

You misread it. It's around 300 hours of training data. It's 4 days to absorb those 300 hours.

I bet a human child who has never even seen a computer could learn to do this in far less than 300 hours. I suspect even 1 hour would be enough.

elif · on Dec 14, 2019

60 models trained on one dataset as well.

gcpwnd · on Dec 14, 2019

Now we just deny the prophecies of the upcoming singularity?

CydeWeys · on Dec 14, 2019

Interestingly, when I started playing the game, this task was essentially impossible for human players without consulting external resources, because the crafting recipes were unlisted, and you needed to know the specific recipes for various things like crafting tables, pick-axes, and shovels that would be nearly impossible to discover by randomly trying to guess recipes.

I don't know if there's a starting tutorial in the game now that explains some basic crafting recipes to get started with (there better be!), but unless the AI can understand it, or you're explicitly programming in the recipes as data, good luck.

6gvONxR4sf7o · on Dec 14, 2019

The challenge is to do this via imitation learning, so it gets to watch others find diamonds first.

SquareWheel · on Dec 14, 2019

Minecraft has had a native recipe book for some time. It displays crafting recipes as you progress through the game.

usepgp · on Dec 15, 2019

That feature has only existed for <20% of the games lifetime. It is more than likely that the person you're replying to played the game before the recipe book existed.

SquareWheel · on Dec 15, 2019

Yes, they said as much. I'm just letting them know that the feature they asked about is now in the game.

willis936 · on Dec 14, 2019

If the challenge was to train an AI to be able to read a recipe book that would be more impressive imo. Even more impressive would be an AI that could gather the necessary resources. It would be a full game playing AI. An imitation AI to get diamonds is neat, but the fact that it is considered feasible on one GPU in one day speaks to the difficulty of the problem more than anything imo.

SquareWheel · on Dec 14, 2019

It may be cheating, and sort of against the point, but those recipes are stored as json files. Makes things a lot easier.

willis936 · on Dec 14, 2019

Allowing access to the data changes the problem. It’s an interesting problem in its own right, but seems much easier to accomplish and arguably much less valuable than limiting input to images and output to mouse/keyboard. Teaching computers to interact with the world is the trick that we want to teach.

SquareWheel · on Dec 15, 2019

That's my thinking as well. It's much more interesting to see how the AI copes with the world, rather than getting bogged down in OCR and other previously-solved problems.

CrazyStat · on Dec 14, 2019

It would be cheating. The contest rules [1] specifically rule out using domain knowledge.

[1] https://gitlab.aicrowd.com/minerl/minerl-resources/blob/mast...

imtringued · on Dec 14, 2019

The humans finding diamonds in 20 minutes are cheaters by that logic.

CrazyStat · on Dec 14, 2019

Yes, a human competing in the AI challenge would be cheating.

lerouxmlr · on Dec 14, 2019

would it really be cheating? I mean if you get a competitive edge by utilizing complex logic, would it be fair ?

moomin · on Dec 14, 2019

The resources the programs can use are well specified. A human child is not part of the specified equipment.

CrazyStat · on Dec 14, 2019

Complex logic would almost certainly involve human domain knowledge which is disallowed.

CrazyStat · on Dec 14, 2019

They had 60 million frames of video (about 277 hours, assuming 60fps) to show to the computer for it to learn how. I think a human player could pick it up in that time.

asdfman123 · on Dec 14, 2019

Minecraft was hugely successful early on specifically because it didn't have to explain itself -- you had to do your research to figure out the game. And when you died you sometimes lost everything.

I guess they needed to add tutorials to better accomodate the wave of younger people who picked up the game in the mid 10's, but I actually appreciate Notch's decisions (or series of happy accidents that culminated in a highly addictive game).

CydeWeys · on Dec 15, 2019

It really did have to explain itself though. It was frustrating trying to play without spoilers because it was basically impossible. You had to go read guides saying how to make a crafting table, how to make tools, etc.

jvanderbot · on Dec 14, 2019

Right. Theres tons of prior dependencies that could be used with regular old planning and scheduling. Deciding what you need right now is actually trivial with that stock knowledge, and ML might help with where or how to get it, etc.

But I dont think the problem is to get Minecraft agents that can make diamond swords ... it's to use only ML to make agents, which means throwing away all that. That's fine.

Erlich_Bachman · on Dec 14, 2019

> Entrants were only allowed to use a single graphics processing unit (GPU) and four days of training time.

Alphastar SCII bot has been using much more resources and time than this to train, so maybe there is one of the reasons no entrant has achieved the goal yet.

> A relatively small Minecraft dataset, with 60 million frames of recorded human player data, was also made available to entrants to train their systems.

It will be interesting to see if these artificial and somewhat arbitrary constraints (although I get that the idea is to restrain it to resources that are somewhat realistically available to a single individual without organizational backing today) will either cripple this challenge or in the end yield some innovative results because the entrants will have to devise algorithms that use much less data and resources than what has been traditionally required to get SOTA results.

Additionally it is also unclear whether the way humans learn to play this game is actually using a smaller or a much bigger dataset to learn from. Sure, a human can learn to play it in 20 minutes, but that's after 9-10 years of other pretraining of seing, understanding and operating in the 3D physical world performing various tasks, getting compressed knowledge from other people by watching and listening to them... Maybe that would be an interesting challenge - to still constrain the final model to 1 GPU for 1 day, but at least allow the model to pretrain on arbitrary similar data, if it is not sourced directly from minecraft or any clones.

TulliusCicero · on Dec 14, 2019

> Alphastar SCII bot has been using much more resources and time than this to train, so maybe there is one of the reasons no entrant has achieved the goal yet.

It's probably worthwhile to note that AlphaStar is trying to become as skilled as the strongest human players, whereas in this case it's more of a binary "is capable of getting diamonds" thing, they don't need to be world-class diamond miners.

mkagenius · on Dec 14, 2019

> single graphics processing unit (GPU) and four days of training time.

Maybe there is a reason they had this restriction? I can only think of allowing the winning AI to be ready for end users whic h generally have a single GPU?

CrazyStat · on Dec 14, 2019

The resources devoted to training don't really translate to the resources needed to run the trained model. A model trained on a thousand GPUS for a week might still run in real time on a single GPU once trained.

The restricted training resources are just part of the challenge. They point out that a human child can learn the necessary steps in minutes by watching someone else do it, so they wanted to see if anyone could make a computer learn it with relatively limited resources.

coddle-hark · on Dec 14, 2019

A level playing field is good for a challenge like this. Those that could solve it without these constraints are pushed to improve their existing approach. Those that don’t normally compete in these kind of challenges aren’t put off by a potential competitor having way more resources.

neltnerb · on Dec 14, 2019

Imposing constraints forces people to be more creative in their approach. This seems like it was explicitly a goal of the challenge, to find more clever ways to solve problems without just relying on more data.

They're just biasing competitors towards efficient solutions.

Iv · on Dec 15, 2019

I like the constraints they put. But realistically, if you want to mimick the way humans learn, you want to put in place some kind of transfer learning: You have a model that has already been trained to understand video frames, maybe has played a few video games before. Then, 60M frames of minetest can be used to understand things about the game.

But also, the machine learning algorithms we have nowadays are in many ways superhuman. Seeing how much can be learned with these constraints is interesting as well.

SiempreViernes · on Dec 14, 2019

I guess the idea is that you come with a pretrained model that you then teach minecraft.

y1tan · on Dec 14, 2019

Goes to contest rules : https://gitlab.aicrowd.com/minerl/minerl-resources/blob/mast... Search 'source' :

- must submit source code - must submit source code - must submit source code

There's your problem right there. It's not that 'creators' are stumped. It's that, given how much it would be worth pitching the same closed source to an investment group or solution seeker directly, no 'creator' with a sufficiently advanced 'new' or capable system would ever submit their source code to one of these competitions. This goes for all competitions that require participants to submit their source code. It's what you you seek or your backers after-all which is valued far more than the potential prize you're doling out. Thus, your business model. The question is always, will someone who can develop such an advanced system be dumb enough to part ways with their IP for such low value? I think not.

So, you can pretty much throw any conclusions made from any such competitions in the trash. This goes for even bigger ones by bigger names. You're going to get what you'd intelligently expect : a small sampling of the same ol' same ol' approach. Don't expect any novel submissions. Don't expect any surprises. So, what's the point of this? I'm speaking about the whole site 'aicrowd' btw and any other group that organizes a 'submit your code' competition ...

krisroadruck · on Dec 15, 2019

Is it screwed up that after I saw the whole "20 minutes" thing I immediately fired up Minecraft for the first time in 18 months to make sure I could still do it that fast? (18 minutes and change in case you are curious)

marcelsalathe · on Dec 14, 2019

Direct link to the challenge: https://www.aicrowd.com/challenges/neurips-2019-minerl-compe...

soup10 · on Dec 14, 2019

Title is misleading. Traditional AI with subroutines for each subtask would have no problem with this.

unoti · on Dec 15, 2019

> Title is misleading. Traditional AI with subroutines for each subtask would have no problem with this.

The easiest way to code this with traditional methods would be to still use deep learning for the image recognition part of it. The input to the agent at each step includes an array of numbers representing the pixels on the screen. So it doesn't get to see the 3d terrain directly; it sees a projection into 2d, and needs to recognize the 3d terrain and objects. And from there it needs to synthesize that into a map of the situation-- for example, the presence of lava, water, cliffs, and hostile monsters.

Doing that kind of object recognition without deep learning would be pretty onerous. The strategy and planning parts of this, though, I agree would be comparatively straightforward. Further, even if it started with a representation of the actual terrain blocks in a 3d model, the synthesis step into a representation of what the overall situation is-- that also is much more easily done with deep learning than with traditional methods.

Back in the day I wrote a highly effective bot for playing Star Wars Galaxies which would run missions unattended. It was created exclusively with hand-coded routines and a big nested state machine. It included a hand-written OCR library for reading the compass coordinates and accepting missions from the mission terminal in a particular direction. It knew a pre-recorded path for navigating in and out of town to and from the mission terminal. It used the little red dots on the radar scanner to determine the presence of nearby hostile creatures and mission targets. I ended up selling this system to a gold farming outfit in China; it was only available for sale to the public for a handful of days.

I bring this up because this bot for SWG didn't contain any modern machine learning or deep learning techniques, and it worked great. But if it weren't for the coordinates displayed on the radar (used for navigating through the maze of town back to the mission terminal), along with easily discernible red dots on the radar indicating enemy presence (used for knowing which direction to face during combat), I'd have been at a loss for the complex object recognition / situation recognition necessary to turn this into a task solvable by a straightforward nested state machine.

johnmoberg · on Dec 14, 2019

The challenge was to accomplish this without domain knowledge... if you can do this with GOFAI, please go ahead and shock the world.

progval · on Dec 14, 2019

You have to use at least some domain knowledge, else you would be making a general-purpose artificial intelligence. The rules actually say: "without relying heavily on human domain knowledge" (emphasis mine).

The rule is actually quite vague, which is not very surprising, as it seems quite hard to define what domain knowledge is allowed and what isn't without having lots of loopholes.

johnmoberg · on Dec 14, 2019

I agree that is vague. But there are plenty of RL algorithms that learn with essentially no domain knowledge, e.g., MuZero[1] which doesn't even use a model. That doesn't make it AGI -- it only masters a single game at a time and we currently don't know how to transfer that knowledge to other domains.

[1] https://arxiv.org/abs/1911.08265

milesvp · on Dec 14, 2019

Was my thought as well. I’m pretty sure a stupid AI that was no more than a couple state machines and a lookup table could collect diamonds.

I’ve written bots for MMO’s and the hardest part for a task like this is fighting the API the bot has access to. I suspect I could even solve this challenge 10% of the time with a blind bot that just did some brownian motion. Unless you never run into trees, every other resources is pretty easy to just dig down for.

The only thing that would make this hard is the thing that makes it hard for a human player which is knowing you’re on the diamond level without using the xyz debugging info.

unoti · on Dec 15, 2019

> The only thing that would make this hard is the thing that makes it hard for a human player which is knowing you’re on the diamond level without using the xyz debugging info.

Something else that makes this much harder than it is for a human: there's no sound! You can't hear the nearby lava or water when digging down.

Check out the information in the observation space-- nothing about sound in here.

http://minerl.io/docs/environments/index.html#minerlobtaindi...

But maybe a computer could do a better job than a human at looking at the clouds before digging, estimating how high they are, then keeping track of how many levels down we've dug. I'm pretty sure the clouds are at a fixed altitude.

twright · on Dec 14, 2019

> hard for a human player which is knowing you’re on the diamond level

That’s a really good point, lava pools start at y=10 so mining around 11 is both safe and in the diamond zone. The question is if there’s enough training data to reinforce an agent so that once it finds a (underground) lava pool it should start mining at that level.

opportune · on Dec 14, 2019

Yeah, the steps of how you get from the base state to diamonds maps directly to classical planning problems. Deep learning would at best be an augmentation to this

Buttons840 · on Dec 14, 2019

Y'all got any of them AIs for turned based strategy games?

Seriously, I'm starting to think turn based strategy games like Civilization, etc, will be the last to get any attention. Why? We need good AIs for these games and they're not yet solved as far as I know. Furthermore, you don't have to model the human interface by limiting actions per second like with DotA or StarCraft. Yes, we solved Go, but turn based strategy games on PC are more complicated. Seems like a worthy area to research.

Does anyone know of any work on games like these?

catalogia · on Dec 15, 2019

>Seriously, I'm starting to think turn based strategy games like Civilization, etc, will be the last to get any attention. Why?

I don't think the answer is technical. A good AI for a game like Civilization would have knowledge of human social concepts like honor and vindictiveness, which are both concepts problematic to pin down. The definitions of concepts like these are fluid, they change with the times and societies in question. But that's not to say any possible setting for these concepts is equally valid when trying to emulate human behavior. Some settings will seem unrealistic, like a mustache twirling villain or a hyper-rational pointy-eared alien. Neither make for good AIs if you're shooting for human-like behavior.

Humans can certainly individually craft fictional personalities that seem realistic; authors do it all the time. But from a gameplay perspective that tactic falls flat; you end up with a limited set of personalities the player becomes familiar with and the game consequently loses relay value. In the Civilization games, every player knows that Gandhi has a short temper and likes to launch nukes. Rather than crafting individual personalities, how does a game designer define a function that returns personalities with a realistic distribution? How can a game designer define such a function if the parameters aren't truly known by science and the artists who craft individual personalities are going off wishy-washy metrics like gut instinct and artistic intuition?

Voloskaya · on Dec 15, 2019

> A good AI for a game like Civilization would have knowledge of human social concepts like honor and vindictiveness,

This is a pretty wild assumption to say the least.

catalogia · on Dec 15, 2019

I'm very curious why you're so skeptical of this. The AI that was created for the Civilization games did in fact try to model these and similar aspects of human behavior, with the intent of creating AIs that played in ways humans would consider natural or fitting. They named the AIs after historical humans and tried to make the AIs act in ways that would be perceived as similar to their namesakes. Nuke-happy Gandhi was originally a bug; that AI was intended to be a pacifist but apparently the in-game invention of democracy causes that value to underflow, rendering him as maximally aggressive. This trait was kept in later games due to fan demand, but that's the exception to the rule. Generally these AIs are intended to behave as humans might behave.

These games have parameters for loyalty, forgiveness, propensity towards warmongering, and more. That's part of what these games are. If you took away that philosophy towards AI design, you'd no longer have a Civilization game.

oefrha · on Dec 15, 2019

It's pretty damn easy to pin down "parameters for loyalty, forgiveness, propensity towards warmongering, and more" (in fact, they already pinned them down) while making the AI smart enough to not leave lucrative resources unimproved, and not smash waves and waves of units into a walled city with +30 strength.

catalogia · on Dec 15, 2019

The problem is that not all combinations of those traits are equally valid, and the optimal distributions of those traits are ill defined. If warmongering is some number 0 to 1, with 0 being Gandhi and 1 being Genghis Khan, that's all fine and good. But what does the distribution curve of a randomly generated nation leader look like and how well does that curve conform to the expectations and perceptions of the player? You can't assume independence; players likely expect a high-warmonger/low-forgiveness leader to be more common than a high-warmonger/high-forgiveness leader, but the later combination shouldn't be strictly forbidden. Rather than pinning down all these curves, the Civilization series leans on a finite number of hand-crafted AI players, created by humans exercising intuition and instinct.

As for tactical AI (which is only one component of Civilization), an optimal play from the AI probably isn't what most players desire, nor is optimal play with randomly imposed inefficiencies. Real humans don't run countries optimally, so an AI that runs a Civilization nation optimally won't ring true. You want an AI that rarely makes mistakes that a human wouldn't make, but frequently makes mistakes that a human would make.

Making a good AI for a game like Civ is not nearly as straight forward as creating an AI for a game like Chess or Go because players have different expectations from the games.

oefrha · on Dec 15, 2019

> As for tactical AI (which is only one component of Civilization), an optimal play from the AI probably isn't what most players desire, nor is optimal play with randomly imposed inefficiencies.

No one's demanding optimal play from AI. Meanwhile I'm sure pretty much no one's happy with idiotic AI, which is what we have at the moment. I already gave an example: "smashing waves and waves of units into a walled city with +30 strength." Any human player familiar with the rules won't sink 100 city-turns worth of production over 20 turns into attacking an unconquerable city, losing everything while shaving 20% off the wall, then repeat that for another 40 turns; that's idiotic, and that's exactly how the AI behaves alarmingly often. Do you enjoy dealing with this kind of behavior? I bet you don't, unless you only derive pleasure from crushing AIs (which I do enjoy, but the satisfaction from the lack of challenge only lasts so long).

Currently game difficulty in Civ is basically defined by how much of a head start AI players have and how much they cheat (which in theory is a consistent advantage throughout the game but in reality matters less and less once the human player starts conquering), so once you overcome your early disadvantages the late game becomes boring.

> Making a good AI for a game like Civ is not nearly as straight forward...

I never said it's easy to create a great AI for Civ. However, it shouldn't be hard to outdo the existing, idiotic AI by a huge margin (while preserving all the traits like nuke-happiness) if, say, DeepMind decides to put some resources into it.

catalogia · on Dec 15, 2019

Making an AI that avoids engagements with unwinnable odds is fine and all, but I would expect that to be a relatively simple matter that doesn't flex the strengths of modern approaches to AI. In fact looking back on my extensive albeit dated experience playing Civ2, I'm pretty sure the AI does in fact prefer engagements in which the odds favor it. If anything, the biggest gripe I had with it is the AI would often employ sound but tedious tactics, moving several dozen units per turn which, although not forbidden by the rules, isn't a lot of fun for the human to wait through. That is a trickier matter, since some humans do in fact play like that, so perhaps the AI should when the human does, but shouldn't when the human doesn't.

I don't think difficulty in analyzing the odds of engagements like that is the reason good AI for Civilization is elusive.

Voloskaya · on Dec 15, 2019

An AI can act in any way you want without having 'knowledge' of human concepts. AlphaStar can play in a variety of style, from defensive to very agressive, and yet has no understanding of concepts or values.

catalogia · on Dec 15, 2019

I think I see the issue, 'knowledge' and 'understanding' are loaded terms. The AI needs to behave as though it understands these things [if it's to be fun], but does not need to in fact 'understand' or 'know' these things.

Voloskaya · on Dec 16, 2019

Exactly yes.

sullyj3 · on Dec 15, 2019

> if you're shooting for human-like behavior

You're not though, you're just trying to build a bot that can win the game.

catalogia · on Dec 15, 2019

Not if you want a fun game. Civilization is a game about role-playing as the leader of a nation, not merely about sprinting to the finish. The AI is meant to take part in that roleplaying, to enhance the experience for the player. An AI that consistently loses but does so in a very human way is better than an AI that consistently wins but plays like a machine.

(Obviously the ideal is an AI that plays like a human and can hold its own, but a playstyle similar to humans is definitely more important than ability to win.)

Buttons840 · on Dec 16, 2019

I disagree and have a hard time enjoying opponents that I know are cheating to enable thmselves to compete.

After I learned the basics my own opinion is that Civilization is no longer a fun game, due to the poor AI.

We can probably agree that finding fun and interesting ways to dumb down an AI will be a cool area for research and new ideas in the future, but we need strong AI before we can figure out fun ways to weaken it.

sullyj3 · on Dec 19, 2019

For sure, but the context of this discussion is AI research, not game design.

Buttons840 · on Dec 15, 2019

Yep. Just beat a human 1v1 half the time. No AI can do it - none I've seen.

simias · on Dec 14, 2019

I'm a bit surprised that anybody is surprised by this lack of success, that seems like a hugely complicated task given the limited amount of time and resources. The amount of possible scenarios the AI needs to be able to handle (fall damage, environmental hazards, block types behaving differently, lighting, day/night cycles and of course enemies) plus the extremely open-ended gameplay where you can almost always dig in every possible direction while collecting and crafting potentially hundreds of items seems to make it a very arduous task.

cwyers · on Dec 14, 2019

Yeah. Look at how complex something like AlphaGo is, and then compare the complexity of Minecraft to Go. They had four days and one GPU. They never had a chance.

Tempest1981 · on Dec 15, 2019

From what I hear, diamonds are located at random, at Y levels 0-15. The terrain is usually at Y level 50-60. So digging is required. The best strategy is to dig on levels 12-15, to avoid lava pools which are on 0-11.

Not sure if the AI can toggle the debug menu, which shows the Y level, or if it can see the Y level during training.

Is this a job well suited to AI, vs a canned algorithm?

Tempest1981 · on Dec 26, 2019

Visualization of ore probability by depth: https://reddit.com/r/dataisbeautiful/comments/efvgve/where_i...

Observations: https://reddit.com/r/dataisbeautiful/comments/efvgve/_/fc2p6...

king07828 · on Dec 14, 2019

My google-fu is failing. I see this and copies of this article, but I want to know more about the I/O available to the AI. Did the AI get full access to the Minecraft API or were they limited audio, video, keyboard, mouse like a human player?

CrazyStat · on Dec 14, 2019

The documentation [1] has details. For input they get an RGB image (i.e. what a player would see on the screen, minus the UI) and other relevant information (e.g. inventory). For output they have to specify from a list of possible actions that they can take: move forward, move back, craft an item, etc.

[1] http://minerl.io/docs/tutorials/index.html

atorodius · on Dec 17, 2019

One thing I was thinking when reading this is that minecraft is visually based on the real world. A human opening it for the first time immediately knows that a tree is a tree. Seeing someone else then cutting it down allows the brain to do a lot of connections to tree-related concepts.

narrator · on Dec 14, 2019

Sounds like the R/L problems they had and solved with Montezuma's Revenge.

uj8efdkjfdshf · on Dec 14, 2019

I take it no one's submitted Baritone [0] yet? Or would that be against the spirit of the contest?

[0] https://github.com/cabaletta/baritone

CrazyStat · on Dec 14, 2019

That would be against the letter of the challenge, not to mention the spirit. From the rules [1]:

> The submission must train a machine learning model without relying heavily on human domain knowledge. A manually specified policy may not be used as a component of this model. Likewise, the reward function may not be changed (shaped) based on manually engineered, hard-coded functions of the state. For example, though a learned hierarchical controller is permitted, meta-controllers may not choose between two policies based on a manually specified function of the state, such as whether the agent has a certain item in its inventory. Similarly, additional rewards for approaching tree-like objects are not permitted, but rewards for encountering novel states (“curiosity rewards”) are permitted.

[1] https://gitlab.aicrowd.com/minerl/minerl-resources/blob/mast...

datameta · on Dec 15, 2019

That's ~11k hours of training data

tus88 · on Dec 14, 2019

> Finding a diamond in Minecraft takes many steps - from cutting trees, to making tools, to exploring caves and actually finding a diamond.

Sounds like a simple state machine. Don't they program objectives into their agents? Or are they relying purely on training?

asdfman123 · on Dec 14, 2019

Yes, programming it as a state machine is straightforward and there's already mods that do this.

I was working on something that would automate MC a while back, too, and I programmed it as a state machine that would loop through various tasks, which consist of subtasks. When you have a goal (like gathering diamonds), that's just a list of tasks -- and then each subtask consists of a list of tasks too, until you break it down to the "point at block", "walk to block", "swing hammer" level, etc.

It was fun but I am pathologically bad at finishing projects that I start.

blt · on Dec 15, 2019

The agent only observes the first-person view in pixels, a compass angle, and a dictionary of inventory items [1]. So any solution would need to have a significant computer vision component.

[1] http://minerl.io/docs/tutorials/first_agent.html#taking-acti...

CrazyStat · on Dec 14, 2019

Purely training. For example, the rules give as an example of something specifically disallowed telling your agent what a minecraft tree looks like--it has to learn that.

tus88 · on Dec 14, 2019

But humans get to RTFM. Why are denying the autonomous agents the same right? It is this kind of oppression that will lead the machines to rise up and overthrow the human race.

berrynice · on Dec 14, 2019

Minecraft is a game that is played by consulting manuals and learning from others, I would be astonished if an AI could magically discover how to find diamonds.

detaro · on Dec 14, 2019

Fairly sure a human would "magically" figure it out with the 11 days of gameplay footage the AI got ;)