The models aren't missing so much as just terrible. Bing doesn't have detailed 3D representations of every building and geographical feature on Earth; they're creating them algorithmically from aerial photography.
Sometimes that results in things like the Sydney Harbour Bridge being rendered as a featureless rectangle.
> they're creating them algorithmically from aerial photography.
Google does the same thing. You used to be able to use SketchUp to model buildings for them, but a few years ago they switched to computer-generated models. It was very obvious from the pointy trees and rough edges on buildings.
This is different from what Microsoft is doing. Microsoft is getting data from municipalities about how tall a building is, and then generating a generic building that's the right shape and as tall as the data says it is. You can see what that looks like in the article: look closely at the Melbourne Cricket Ground before the community fixed it, and it's just an oval building that looks like a generic multi-story apartment building. That's their algorithmic generation when they don't have 3D data from aerial photos. (There are also fun anomalies, like the world's tallest building being as wide as a single house because someone typo'd the number of floors in the official records.)
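That kind of anomaly is easy to reproduce in miniature: if the generator multiplies a floor count straight out of the records by an assumed per-floor height, one mistyped digit turns a house into a skyscraper. A toy sketch (the 3 m storey height and the function name are my own illustration, not Bing's actual pipeline):

```python
# Assumed average storey height -- an illustrative constant,
# not whatever value Microsoft actually uses.
FLOOR_HEIGHT_M = 3.0

def building_height_m(floors: int) -> float:
    """Naive height estimate taken straight from a records field."""
    return floors * FLOOR_HEIGHT_M

# A 2-storey house vs. the same record with an extra digit typo'd in:
house = building_height_m(2)    # 6.0 m
tower = building_height_m(212)  # 636.0 m -- taller than almost any real building
```

Nothing in that path sanity-checks the result against the building's footprint, which is how you end up with a house-sized lot carrying a record-breaking tower.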
As far as I can tell, Google isn't doing that. They only have 3D buildings where there are aerial photos that can see the walls of the building; they then use that data to construct a 3D model.
Google doesn't use AI/ML for automagic photogrammetry; as a result, they are really bad at things like balconies, which end up looking more like a glitch than a building feature. They also don't feed 3D models back into their maps. My building stopped existing on Google's 2D map sometime 10 years ago, despite having a fairly accurate 3D model.
Balconies are simply a question of where the threshold for shape simplification sits. A whole city is way too much data to represent and download over the Internet, so somebody decided that buildings should look pixel-perfect from half a mile away; get closer and you see the funny edges so common in edge-collapse 3D shape simplification.
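For the curious, the basic edge-collapse idea is: repeatedly merge the two endpoints of the cheapest edge until you hit a triangle budget. A hypothetical minimal version that ranks edges by length (real pipelines use error metrics like quadrics, and this is certainly not Google's actual code):

```python
def collapse_shortest_edges(vertices, triangles, target_tris):
    """vertices: list of (x, y, z); triangles: list of (i, j, k) index tuples.
    Greedily collapses the shortest edge until <= target_tris remain."""
    verts = [list(v) for v in vertices]
    tris = [list(t) for t in triangles]
    while len(tris) > target_tris:
        # Collect every unique edge present in the current triangle set.
        edges = {tuple(sorted((t[a], t[(a + 1) % 3]))) for t in tris for a in range(3)}
        if not edges:
            break

        def length2(e):
            p, q = verts[e[0]], verts[e[1]]
            return sum((a - b) ** 2 for a, b in zip(p, q))

        i, j = min(edges, key=length2)
        # Merge j into i at the edge midpoint.
        verts[i] = [(a + b) / 2 for a, b in zip(verts[i], verts[j])]
        tris = [[i if v == j else v for v in t] for t in tris]
        # Drop triangles that became degenerate (two identical corners).
        tris = [t for t in tris if len(set(t)) == 3]
    return verts, tris
```

Tune `target_tris` too aggressively and a balcony's worth of detail merges into exactly the kind of funny edges described above.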
The problem is it's not edge collapsing; it's somewhere half in the middle, producing triangle-shaped protrusions with badly perspective-mapped texture on top. In this particular use case, the photogrammetry algorithm should have some hardcoded common sense: detect a boxy building and try to maintain flat surfaces and straight lines. Instead, it looks like Google picked one of the open source SLAM implementations and deployed it at scale without fine-tuning :(
Look at the 41 Central Park West train wreck. You have >20 good photos to work with, but the end result doesn't even maintain straight windows. The balconies look like Lara Croft's breasts from the first Tomb Raider, not to mention they're all different geometry despite high-quality source material showing them all being the same shape. The best data-saving option would probably be no additional geometry at all, just a texture.
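The "maintain flat surfaces" idea could be bolted on as a post-processing step: fit a plane to a patch of reconstructed points and snap them onto it. A minimal sketch (my own illustration, not anyone's shipping pipeline; the z = ax + by + c form can't represent vertical walls, so a real version would fit planes in an arbitrary orientation):

```python
import numpy as np

def flatten_to_plane(points):
    """Fit z = a*x + b*y + c by least squares and snap each point onto
    the fitted plane -- a crude stand-in for the 'keep surfaces flat'
    regularization photogrammetry output seems to lack."""
    pts = np.asarray(points, dtype=float)
    # Design matrix [x, y, 1] for the least-squares plane fit.
    A = np.c_[pts[:, 0], pts[:, 1], np.ones(len(pts))]
    coeffs, *_ = np.linalg.lstsq(A, pts[:, 2], rcond=None)
    flat = pts.copy()
    flat[:, 2] = A @ coeffs  # project every point's z onto the plane
    return flat
```

Points already on a plane pass through unchanged; noisy ones get projected onto the least-squares fit, which is exactly the behavior you'd want applied to a wall full of identical balconies.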
If you're into that sort of thing, OpenStreetMap supports 3D mapping as well. I haven't looked into it much, and from my understanding it's fairly basic, but at least you're sure your data will still be around in ten years and openly available for people to use and display.
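Concretely, OSM's "Simple 3D Buildings" scheme is just extra tags on the building outline, which 3D renderers then extrude. A hedged example of the kind of tags involved (the values here are illustrative, not from any real building):

```
building=apartments
building:levels=12
height=38
roof:shape=flat
```

Height in metres wins over `building:levels` when both are present; renderers that only know the level count typically assume roughly 3 m per storey.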
Two differences: Google is way ahead, and they definitely have some quality control or manual editing happening on the algorithmically generated models - at least for higher profile buildings.
The flight simulator buildings are procedurally generated, guided by an AI. They probably went this route because it makes better-looking buildings with fewer polygons and less disk space.
99% of the time it works great because you don't really care if a virtual house exactly matches the real house. You just want to see some realistic looking housing estates.
However, the downside shows when you go and see some well-known landmark. They didn't write a procedural palace generation routine, so when the AI sees Buckingham Palace it has to pick the closest "normal" building type, which is apparently an office block.
I suspect the best way to fix that would be to detect when the AI fails and fall back to Google Maps style scanning, which looks worse, but actually matches reality. You could also do landmark detection fairly easily - Bing Maps must have enough data about what people search for and take photos of.
Where "Google Maps style" 3D buildings and terrain are available, it's used in MSFS. It's just that there are many places without it, where the game only has a 2D satellite map to work with (plus some extra data that's available like number of floors in the buildings). In those cases it uses the AI generation.
It's a common trick to use building footprints for the basic shape, elevation data for the building height, and prevailing colors for estimating roof and wall structure. It's still much better than having all buildings flat. They likely trained their ML algo on many different building photos from orbit to output e.g. roof type, wall type, etc.
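The footprint trick itself is simple: copy the 2D outline up to the estimated height and stitch wall quads between the base and top rings. A minimal sketch (my own illustration, assuming a simple non-self-intersecting footprint):

```python
def extrude_footprint(footprint, height):
    """footprint: list of (x, y) points; returns (vertices, wall_quads).

    Lifts a copy of the outline to `height` and connects each base edge
    to the corresponding top edge with a quad -- the basic block model
    used when only a footprint and a height estimate are available."""
    n = len(footprint)
    base = [(x, y, 0.0) for x, y in footprint]
    top = [(x, y, float(height)) for x, y in footprint]
    vertices = base + top
    # One wall quad per footprint edge: (base_i, base_j, top_j, top_i).
    walls = [(i, (i + 1) % n, (i + 1) % n + n, i + n) for i in range(n)]
    return vertices, walls
```

The roof would be a triangulation of the top ring; predicted roof-type data could swap in a gabled or hipped top instead of a flat one.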
Some of the data Bing uses comes from OpenStreetMap. If a building isn't in Microsoft's other data sources and lacks data in OSM, then it will be missing.