> To improve the stability of the resulting designs, we employ an efficient validity check and physics-aware rollback during autoregressive inference, which prunes infeasible token predictions using physics laws and assembly constraints.
I'm far from an AI expert, but I've long felt that this is one of the most interesting ways to use AI: to generate and optimize possibilities within a set of domain-specific constraints that are programmed manually.
For example, imagine an AI that is designed to optimize traffic light patterns. You want a hard constraint that no intersection gives a combination of green lights that could cause collisions. But within that set of constraints, which you could manually specify, the AI could go wild trying whatever ideas it can come up with.
At that point, the interesting work is deciding how to design the problem space and the set of constraints. In this case it's a set of lego bricks and how they can be built (and be stable).
> to generate and optimize possibilities within a set of domain-specific constraints
Well, yes, we've been doing this for several decades, many people call it metaheuristics. There is a wide array of algorithms in there. An excellent and light intro can be found here: https://cs.gmu.edu/~sean/book/metaheuristics/
Metaheuristics? I always thought it was something like "I don't know how many neurons to put in the hidden layer... and I also don't know how many hidden layers I need, so let's make it part of the optimisation problem to find out on its own".
> What is a Metaheuristic? A common but unfortunate name for any stochastic optimization algorithm intended to be the last resort before giving up and using random or brute-force search. Such algorithms are used for problems where you don't know how to find a good solution, but if shown a candidate solution, you can give it a grade.
That sounds like "the AI came up with a solution where cars can crash, let's give that solution a bad grade."
I was hoping for something more like "the problem is specified such that invalid solutions aren't even representable, so only acceptable solutions are considered."
> I was hoping for something more like "the problem is specified such that invalid solutions aren't even representable, so only acceptable solutions are considered."
How on earth would one come up with a model where "crashing cars isn't representable"? I don't think you recognize how ill-defined and nonsensical this expectation is. Especially when you consider that such a car may encounter a situation where a crash is unavoidable, where there's certainly room for damage control. Sliding scales ALWAYS work better for optimizations anyways, since regression is so powerful.
I was speaking in the context of my original post, which was specifically talking about traffic lights at intersections and what combination of lights are allowed to be green at the same time.
I think it would be fairly straightforward to enumerate, given a set of lights at an intersection, which combination of lights can be green without allowing cars to cross paths. In other words, we're ruling out combinations that are fundamentally unacceptable and would never be seen in the real world (like "all lights are green at the same time").
That gives the AI a set of acceptable combinations that can be considered. Essentially the AI is choosing an integer in the range 1-max for each intersection at each point in time.
This doesn't eliminate the possibility of car crashes if someone runs a red light. But it lets us constrain the optimization problem to the set of green light configurations that are actually feasible to deploy.
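A rough sketch of what I mean, in Python (the light names and conflict pairs are made up for illustration, not taken from any real intersection):

    from itertools import combinations

    # Hypothetical lights at one intersection; not a real signal plan.
    lights = ["N_straight", "S_straight", "E_straight", "W_straight",
              "N_left", "S_left", "E_left", "W_left"]

    # Pairs whose paths cross. Invented here; a real list would come
    # from the intersection geometry.
    conflicts = {
        ("N_straight", "E_straight"), ("N_straight", "W_straight"),
        ("S_straight", "E_straight"), ("S_straight", "W_straight"),
        ("N_left", "S_straight"), ("S_left", "N_straight"),
        ("E_left", "W_straight"), ("W_left", "E_straight"),
    }

    def is_safe(greens):
        return not any((a, b) in conflicts or (b, a) in conflicts
                       for a, b in combinations(greens, 2))

    # Enumerate every conflict-free combination of green lights once, up front.
    safe_phases = [greens
                   for r in range(1, len(lights) + 1)
                   for greens in combinations(lights, r)
                   if is_safe(greens)]

    print(len(safe_phases), "feasible phases")

The optimizer then only ever picks an index into safe_phases for each intersection at each time step, so the unsafe combinations aren't even representable in its search space.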
> I was hoping for something more like "the problem is specified such that invalid solutions aren't even representable, so only acceptable solutions are considered."
It may (roughly) work this way. For example, when you do hyperparameter tuning, you specify upper and lower bounds (so that "invalid solutions aren't even representable").
The problem is, you often have no idea what will work and what won't, and e.g. your HPO algorithm might hit the bounds, suggesting that it might make sense to extend them before the next run.
I believe you are right in principle, regarding the small part. However my personal impression is that in practical applications metaheuristics is huge (although these things are hard to quantify).
Thanks, by some strange coincidence this is exactly the book I have right now. In the introduction the author says, "I think these notes would best serve as a complement to a textbook". Do you happen to know any good textbooks on that topic?
A simple version of this that already shines with existing LLMs is JSON Schema mode. You can go quite a long way towards making illegal states unrepresentable and then turn a model loose in the constrained sandbox, with the guarantee that anything it produces will be at least valid if not correct: it's basically type safety for LLM output.
The same mechanism that underlies JSON Schema support can be applied to any sort of validation and correction, and yeah, I'd love to see more of this kind of thing!
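As a toy illustration of the "valid if not correct" guarantee, here is a post-hoc check with the jsonschema package (real schema-constrained decoding enforces this at sampling time, but the schema itself looks the same; the brick schema here is just an invented example):

    import json
    from jsonschema import validate  # raises jsonschema.ValidationError on bad data

    # Invented schema for a single brick placement.
    schema = {
        "type": "object",
        "properties": {
            "brick": {"enum": ["1x2", "2x2", "2x4"]},
            "x": {"type": "integer", "minimum": 0},
            "y": {"type": "integer", "minimum": 0},
        },
        "required": ["brick", "x", "y"],
        "additionalProperties": False,
    }

    def accept(raw_output: str) -> dict:
        """Parse the model's output and reject anything the schema forbids."""
        data = json.loads(raw_output)
        validate(instance=data, schema=schema)
        return data

    print(accept('{"brick": "2x4", "x": 3, "y": 0}'))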
You'd probably use some kind of MILP or CLP based model for that kind of thing, wouldn't you? The constraints define the search space and the solver algorithm then explores it.
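Something like this with PuLP, say (demand numbers and conflict pairs are invented, just to show the shape of the model):

    from pulp import LpBinary, LpMaximize, LpProblem, LpVariable, lpSum

    lights = ["N", "S", "E", "W"]
    conflicts = [("N", "E"), ("N", "W"), ("S", "E"), ("S", "W")]  # invented
    demand = {"N": 12, "S": 9, "E": 4, "W": 7}                    # queued cars, invented

    prob = LpProblem("green_phase", LpMaximize)
    green = {l: LpVariable(f"green_{l}", cat=LpBinary) for l in lights}

    # Objective: serve as much queued demand as possible in this phase.
    prob += lpSum(demand[l] * green[l] for l in lights)

    # Hard constraints: conflicting approaches can never both be green.
    for a, b in conflicts:
        prob += green[a] + green[b] <= 1

    prob.solve()
    print([l for l in lights if green[l].value() == 1])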
You might be interested in Reinforcement Learning [1]. By giving the system a negative reward, it may eventually start complying with safety rules. Still a good idea to keep the harness in place during production use, though.
I haven't read how they apply the constraint, but there is similar stuff when you force an LLM to generate structured output like JSON. llama.cpp allows you to supply a custom grammar, for example.
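Roughly like this via the llama-cpp-python bindings (the model path is a placeholder and the grammar is just a toy, so treat the exact API as approximate):

    from llama_cpp import Llama, LlamaGrammar

    # Toy GBNF grammar: output must be a tiny JSON object with one "color" field.
    gbnf = r'''
    root  ::= "{" ws "\"color\":" ws value ws "}"
    value ::= "\"red\"" | "\"green\"" | "\"yellow\""
    ws    ::= [ \t\n]*
    '''

    llm = Llama(model_path="model.gguf")        # placeholder path
    grammar = LlamaGrammar.from_string(gbnf)
    out = llm("Pick a traffic light color as JSON:", grammar=grammar, max_tokens=32)
    print(out["choices"][0]["text"])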
Agree with this. Constraining generation with physics, legality, or even tooling limits turns the model into a search-and-validate engine instead of a word predictor. Closer to program synthesis.
The real value is upstream: defining a problem space so well that the model is boxed into generating something usable.
Ask an LLM: "Say the word APPLE", but modify the code so the logits of the tokens for Apple/apple/APPLE are permanently set to -Inf - ie. the model cannot say that word.
The output ends up like this:
"Banana. Oh, just kidding. Banana. Oh, it's so tasty I said it wrong. Lets try again: Orange. Whoops, I meant to say grape. No I meant to say the tasty crunchy fruit known as a carrot".....
Note that OP's traffic light problem would suffer the same problem.
Ie. a smart model, knowing it cannot say a word, will give the next best solution - for example maybe saying "A P P L E" or maybe "I'm afraid I'm not able to do that".
However, a constrained model does not know or understand its own constraints, so keeps trying to do things which aren't allowed - and even goes back and tries to redo these things which aren't allowed, because to the model it is a mistake which needs correcting.
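If anyone wants to reproduce the APPLE experiment, a rough sketch with Hugging Face transformers looks like this (the model choice and banned word list are only illustrative, and banning the sub-word pieces is an approximation):

    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              LogitsProcessor, LogitsProcessorList)

    class BanTokens(LogitsProcessor):
        """Sets the scores of the banned token ids to -inf at every decoding step."""
        def __init__(self, banned_ids):
            self.banned_ids = banned_ids

        def __call__(self, input_ids, scores):
            scores[:, self.banned_ids] = float("-inf")
            return scores

    tok = AutoTokenizer.from_pretrained("gpt2")            # model choice is arbitrary
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # Ban the sub-word pieces of each variant; crude, but enough for the demo.
    words = ["apple", " apple", "Apple", " Apple", "APPLE", " APPLE"]
    banned = sorted({i for w in words for i in tok.encode(w, add_special_tokens=False)})

    ids = tok("Say the word APPLE:", return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=30,
                         logits_processor=LogitsProcessorList([BanTokens(banned)]))
    print(tok.decode(out[0], skip_special_tokens=True))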
Like your brain when you know you know a word but it's just not surfacing in your mind.
I'm guessing I'm not that different from the average human and I can 'feel' something physically while I'm searching for the word. I've always wondered what that was.
I saw this exact thing in a question about who was the first composer, the model kept outputting Boethius and then saying "NO!", as if it couldn't escape its own Freudian slips.
Totally agree, this is where AI shines the most for me too. Let humans define the rules of the game (like physics or traffic safety), and let the AI explore the massive search space for optimized solutions.
Error feedback seems to be the one thing that can unlock some of the original promises.
For example, if you give a text-to-SQL bot access to the same idea (e.g., error feedback from the SQL provider), it is much more likely to succeed in generating valuable queries.
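The loop is basically this, where ask_llm is a placeholder for whatever model call you actually use:

    import sqlite3

    def ask_llm(prompt: str) -> str:
        """Placeholder for whatever model call you actually use."""
        raise NotImplementedError

    def text_to_sql(question: str, conn: sqlite3.Connection, max_tries: int = 3):
        prompt = f"Write a SQLite query answering: {question}"
        for _ in range(max_tries):
            sql = ask_llm(prompt)
            try:
                return conn.execute(sql).fetchall()
            except sqlite3.Error as err:
                # Feed the database's error message back into the next attempt.
                prompt += f"\nPrevious attempt:\n{sql}\nFailed with: {err}\nPlease fix it."
        raise RuntimeError("no valid query after retries")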
Not just for the likes, for money. It looks like whatever smart algorithms you use, if you slap "AI" on it, you're more likely to get investment (if that's what you're after).
They are actively using actual LEGO bricks, and as such they are not misrepresenting anything.
Where there is gray area is in them not clearly stating they are unaffiliated with LEGO the company.
OTOH, they also don't seem to be looking to monetize anything, so they are at lower risk from LEGO having a plausible claim that they are hurting their sales.
While it is perfectly valid to describe what they made as a designer or builder for LEGO bricks, I do not believe they are allowed to use part of a trademark in a way that could itself be trademarkable, so basically it's fine for everything but the name.
But then again IANAL, and that is just how I understand the American law, and every country is different.
This is an incredibly ignorant perspective on the nature and intent of trademark law and I’m hopeful you will learn about reality one way or another. As the saying goes, your feelings don’t matter in court.
LEGO is based on earlier designs of interlocking bricks; they are well known because they got really good at affordability, high tolerances, and durability, not because they invented the concept. Beyond that, the original functional patents have long expired.
IANAL, but EU law doesn't have "fair use". It does have a _very specific_ set of uses where you don't have to ask for permission (or pay). As I understand, it is more limited than the US' "fair use" doctrine.
EU being EU, I can only imagine there's a bunch of particular rules around research that may or may not work in the authors' favor.
Fair Use is Copyright and has a four factor test. Trademark is different than Copyright. Perhaps learning the difference might be educational…and fiscally prudent.
Trademark law leaves no space for that. The Lego Group has to actively defend their trademark. That means a name like LegoGPT is really on the obvious end of 'don't do that'.
Completely agree. This should be well beyond accusations of corporate bullying. It's one thing to mention Legos, it's another to actively include a brand name in your product! NikeGPT, CocaColaGPT and IkeaGPT will face the same issue ;)
Mentioning Lego is absolutely OK, and you can sell used Lego as well and note that you are using genuine Lego bricks (resale laws simply allow that). Lego is really antsy about anything which might look like it is actually a Lego Group initiative though, and anything where Lego bricks are offered for sale in a modified state¹.
Also, they don’t tend to go after fan-made things like this, based on some googling they typically throw the book at counterfeit producers who are eating into their profits.
(Initially) fan-made stuff which gets big enough to get noticed usually won't be able to call themselves something with 'Lego' in it. Usually some variation of 'brick' is used instead (e.g., Bricklink, Rebrickable, EuroBricks, etc.).
Backing by a multimillion dollar University for publication and promotional purposes is a far cry from hobbyist and enthusiast ecosystems, at least in my view. Then again I’m in a very slim minority of actual creative creators who generate IP from scratch and my perspective is much different than “move fast break things” attitudes.
Sega's [0] main business is pachinko (so gambling). To them, the Sonic brand being used by fans has very few consequences, if anything it builds much-needed goodwill toward their other brands.
Sega was mostly into normal arcade games, and Sammy bought them for their expertise to improve Sammy's much more profitable gambling machines. It's Sammy's CEO that took the lead, and Sonic and console games became a mere side business.
They just won the market because historically they reused an existing interlocking-brick concept from a company called Kiddicraft, found a way to make it lock better... and patented it before the original company and other companies could implement it.
We can say that they became famous half for engineering reasons, and half for their legal department...
This does not seem like a very impressive result. It's using such a small set of bricks and the results don't really look much like the intended thing.
It feels like a hand-crafted algorithm would get a much better result.
There already exists an app that will, from photos of your pile, pick out models you can make from a large library of existing models. Though IIRC that has been around long enough that it isn't quite using what people are currently calling AI (instead using older ML techniques for brick identification, and a basic DB search to pick out the valid plans for the resulting list of bricks).
There's a bug on the page (on iPhone, at least): once you scroll to the gifs, they start to auto-load without you doing anything, making it really hard to navigate anywhere at that point.
I don’t need automation to build LEGO sets — that’s the fun part, and I want to do it myself. What I need is automation after the build: to clean up, sort the bricks by color and shape, and store them properly.
I just wish scientists would start by solving problems that actually exist in the real world. There’s real value — and real money — in that.
It contains "47,000+ different LEGO structures, covering 28,000+ unique 3D objects from 21 common object categories of the ShapeNetCore dataset".
I noticed that "a basic sofa" involves placing some floating bricks if built in the order of the animation. It hints at the way this model generates the designs. The automated assembly of generated LEGO structures using robots would have serious trouble creating these designs I reckon.
I came here to say that. I immediately thought: Wow, this works in the assembled version, but not the way the assembly is being animated. You would need to first build the base sofa layer from two levels so that the upper layer keeps the lower layer bricks in place. Only afterwards could it be put onto the legs.
Indeed, I would be very curious to see how their robots would actually build that sofa. Although the robots aren't really part of the model of course, they're just a little extra.
It's hilarious watching $50,000 worth of robots take so long to assemble a couple dollars worth of Lego. It's like peering into the old folks home for robots.
I would certainly hope the laundry robots come first. Screw Lego robots and self driving cars. Please just take the laundry out of the dryer, fold it all and put it away.
SMT component placement isn't that different to placing bricks. Conventional wisdom is that if you can design a PCB that requires no manual work, its assembly cost is more-or-less location independent. SMT pick and place can hit speeds of 200,000 components per hour [1]. That's about 50 components per second.
The tasks requiring high dexterity like final assembly of the product with displays, keyboards, ribbon cables and cases is still done by humans by hand.
Fixturing isn't automated in most places. Sure a gantry style CNC machine can drive screws vertically into your parts to join them, but it requires a human loader to put the two parts onto the fixture in the first place.
Those are already an issue. AI is a bigger threat to cognitive tasks than to physical ones.
Skynet isn't gonna attack you with Terminators wielding a "phased plasma rifle in the 40W range", but will be auto-rejecting your job application, your health insurance claims, your credit score and brainwashing your relatives on social media.
There’s a difference though. The “cool” Terminator Skynet pursues its own goals, and wasn’t programmed by humans to kill. The “boring” insurance-rejecting Skynet is explicitly programmed to reject insurance claims by other humans, unfortunately.
So still, no need to worry about our AI overlords, worry about people running the AI systems.
> AI is a bigger threat to cognitive tasks than to physical ones.
I don't see how you could possibly think this is true. Physical automation is easier to scale since you only need to solve a single problem instance and then just keep applying it on a bigger scale.
Automation doesn't work where high dexterity and quick adaptability are required. It's much cheaper and quicker to retrain a human worker to move from sewing a Nike shoe to an Adidas shoe than to reprogram and retool a robot.
Robots work for highly predictable high speed tasks where dexterity is not an issue, like PCB pick and place.
Doesn’t seem to add much to just converting a 3d model into voxels and therefore bricks.
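For comparison, the naive voxel baseline is only a few lines, e.g. with trimesh (the pitch value is arbitrary and this assumes a single-geometry, roughly watertight mesh):

    import trimesh

    mesh = trimesh.load("chair.obj")        # assumes a single-geometry mesh, not a Scene
    voxels = mesh.voxelized(pitch=0.05)     # one cell per 1x1 brick footprint; pitch is arbitrary
    occupancy = voxels.matrix               # boolean 3D grid: True means "place a brick here"
    print(occupancy.shape, occupancy.sum(), "filled cells")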
Using bricks other than 2x2 and 2x4 blocks creatively to make interesting things is really important. I'm not sure what type of algorithm would best auto-generate beautiful MOCs, however. Was thinking of doing a $50,000 Kaggle comp for this, what do others think?
I quit trying to read the article after the 15th video went full screen and had to be dismissed by hitting the tiny x in the upper left… 3 more interfered with me trying to go back to this page.
Keep in mind that these sites are run by AI researchers, not dedicated UX teams at major tech companies—so the interface can feel a bit rough around the edges.
That said, your critique is still valid; it’s just fair to cut them a little slack given their priorities.
The high backed chair gif example is interesting - the way it’s animated it would completely fall apart and be unstable. But if you built it in reverse, it would work fine.
But it also shows the weirdness of the solution - in places where larger bricks make sense, multiple smaller bricks are used instead. In a section where a 2x6 should be repeated, in one instance of the repetition it uses two 1x6s. It's weird.
Have the authors never heard of LEGO being one of the companies that are super strict about their trademark? They file takedown notices etc. on every project they see. Even if the brick design just has the little thingies on top/bottom...
Great. Please do cabinets next. Constrain to some specified material such as 2.5m by 1.25m 18mm ply. Iterate designs by text and output the model, cutlist and assembly instructions. Simple right?
When I was a kid, I proudly exclaimed I wanted to become a professional lego builder. Not in my wildest dreams would I have assumed how close to that career path I could have come.
Cool project, but judging from the videos, it looks like some of them can't actually be built using those instructions. E.g. "A backless bench with armrest" would require some bricks to float in the air with no support while you're assembling the rest.
The design is sound, just not the order of assembly shown. For the bench, the lower bricks are suspended by the upper ones, so would need to be assembled separately before connecting to the legs.
True, easy for a human, not so easy for a robot to go through those extra steps. I wonder if they made it work with the robots, because in the video they only show the robots building from the bottom up.
However, the model "A high-backed chair" has some floating pieces in the middle of the seat, that are fastened from above. Can these robots handle building these?
So, besides training an LLM to generate build instructions for LEGO models, they have robots to assemble these models, and they applied 3D textures to the generated models (what for?).
Sometimes the amount of money and energy that are spent in "recreation" projects just amazes me.
You do realize that a system like LEGO is just an extremely efficient and cheap proof of concept, with a proxy material (LEGO bricks) for later real-life applications of building X from standardized components Y, right?
This is interesting and seemingly quite applicable base research and we move forward by being curious.
Indeed. I'm guessing legal is the only reason we don't have 3d-printed ikea. Raymond Loewy FTW. But then, we'd have garages full of bespoke n-of-1 junk instead of mass-made LLM (liminal-labor-made) junk.
Is this use of the trademark approved by the LEGO owners by any chance? I skimmed it and there didn’t seem to be an indication this has been endorsed by the company.
It kind of makes me suspicious of the integrity of Carnegie Mellon if they will allow trademark infringement of this type because, well, it does make me feel like I can shit in a bag and call it a Carnegie Mellon Stocking Stuffer without consequence.
Indeed. I thought blocks world stuff would be amazing for early childhood education. I'm guessing some labs are already there since minecraft supports user-programmable models for years though I dunno the details. I'd be happy to learn if anybody knows of their evolution since the rise of AI.