Reprojecting the Perseverance landing footage onto satellite imagery (matthewearl.github.io)
633 points by bmease on March 14, 2021 | 37 comments



An interesting side point is that the graph optimization approach used here is somewhat similar to modern graph-based visual SLAM.

The graph in the article can be seen as a factor graph. VSLAM systems usually have a (roughly bipartite) factor graph whose vertices/variables are either keyframes or ‘good’ features, with edges between features and the frames that see them, and between adjacent frames; each of these edges is a factor in the graph. This structure results in very large but sparse graphs, and there are factor graph optimization libraries that take advantage of this sparsity (e.g. g2o or GTSAM). These libraries also use specialized optimization techniques for some of the nonlinear manifolds (e.g. SO(3)) that arise in SLAM problems.
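
For a concrete flavour of what such a factor-graph optimization boils down to, here's a toy sketch in Python (nothing like the article's code or a real VSLAM backend, which would handle rotations on SO(3), feature vertices and sparse solvers): three 2D frame positions linked by noisy relative-translation factors, jointly refined by least squares.

    import numpy as np
    from scipy.optimize import least_squares

    # Factors: (i, j, measured translation from frame i to frame j)
    factors = [
        (0, 1, np.array([1.0, 0.1])),
        (1, 2, np.array([1.0, -0.1])),
        (0, 2, np.array([2.1, 0.0])),   # a loop-closure-style constraint
    ]

    def residuals(x):
        poses = x.reshape(-1, 2)
        res = [poses[0]]                # prior factor pinning frame 0 at the origin
        for i, j, meas in factors:
            res.append((poses[j] - poses[i]) - meas)
        return np.concatenate(res)

    x0 = np.zeros(3 * 2)                # initial guess: all frames at the origin
    sol = least_squares(residuals, x0)
    print(sol.x.reshape(-1, 2))         # refined frame positions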


Not only impressively coded but a beautiful result as well. Fascinating to have that real-time, frame-by-frame comparison. Great job!

It only reinforces that I really need to learn my matrix math.


Plug: Just wrote a piece on this topic that you might enjoy reading: https://www.heinrichhartmann.com/posts/2021-03-08-rank-decom...


Maybe they only uploaded a 1080p version of the video - but I was expecting higher def. ...then again, I suppose interplanetary bandwidths are probably not great.


Pix4D made a 3D reconstruction:

https://www.youtube.com/watch?v=20wCGKTpOJw


Fascinating and well explained!

Reminded me of the video tracking work these folks do: https://forensic-architecture.org/


Wow. Thanks for sharing.

30+ years ago, I had friends reconstructing crime scenes for court proceedings. Architectural drawings and 3D scenes. They used AutoCAD and AutoSolid (?). Showing stuff like blood and ballistics.

Super effective. They turned my stomach.

I don't have words for these Forensic Architecture recreations. I almost feel like I'm there (present).

I can only imagine their future VR recreations will be overpowering.


These are outstanding!


It'd be pretty neat to lift this up into 3d - you could probably reverse the transforms to find the camera pose for each frame, then drop it into a scene alongside the camera frustum and the topography so we can see exactly how much steering the descent stage did to hit its target and how fast it was descending at every stage.
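
As a rough, hypothetical sketch of the "reverse the transforms" part: given a homography H that maps a descent frame onto the (roughly planar) satellite map and a guess at the camera intrinsics K, OpenCV can decompose H into candidate camera poses. The K and H values below are placeholders, not calibration data from the article.

    import numpy as np
    import cv2

    K = np.array([[1000.0, 0.0, 640.0],    # hypothetical focal length / principal point
                  [0.0, 1000.0, 360.0],
                  [0.0, 0.0, 1.0]])
    H = np.array([[1.02, 0.01, 15.0],      # placeholder frame-to-map homography
                  [-0.01, 0.98, -8.0],
                  [1e-5, 2e-5, 1.0]])

    n, rotations, translations, normals = cv2.decomposeHomographyMat(H, K)
    # Up to four mathematically valid solutions come back; you'd keep the one
    # whose plane normal and translation make physical sense (camera above the
    # ground, descending), then place that frustum in the 3D scene per frame.
    for R, t in zip(rotations, translations):
        print(R, t.ravel())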


On the other hand, I'd fully expect NASA/JPL to have IMU telemetry from the EDL. If I'm not completely mistaken, it would eventually get published in the PDS here: https://naif.jpl.nasa.gov/naif/data.html

For example for MSL (Curiosity, the previous rover) the EDL CK and SPK provides orientation and position data if I'm interpreting this description right: https://naif.jpl.nasa.gov/pub/naif/pds/data/msl-m-spice-6-v1...

The downside being that it'll probably take 6-12 months until the data is made public.

(EDL = entry, descent, landing; IMU = inertial measurement unit; PDS = planetary data system)


I’ve been thinking more about the navigation of their little helicopter.

On Earth we're used to being able to use GPS for route planning. If you could use this process in reverse to constantly determine one's position in 3D space above the surface, using stored satellite imagery and a downward-facing camera cross-referenced with whatever gyro/accelerometer-based positioning they're using, I wonder if there'd be any benefit. Maybe what they've got already is sufficient for anything you'd want to do in the near future.


> If you could use this process in reverse to constantly determine one's position in 3D space above the surface, using stored satellite imagery and a downward-facing camera cross-referenced with whatever gyro/accelerometer-based positioning they're using, I wonder if there'd be any benefit

That is pretty much exactly how TRN worked for the EDL. I don't think Ingenuity has much in terms of navigation ability, probably just basic INS. But it's also not intended to fly any extended distances, so it doesn't really need any navigation abilities. I'd imagine future copters would use TRN-style navigation.

(TRN = terrain relative navigation)
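
For illustration only, a hedged sketch of that kind of map-matching localization: match a downward-looking camera frame against a stored orbital map with ORB features, estimate a homography, and read off where the frame centre lands on the map. The file names are placeholders, and a real system would fuse this with the IMU/INS estimate.

    import numpy as np
    import cv2

    map_img = cv2.imread("stored_satellite_map.png", cv2.IMREAD_GRAYSCALE)
    frame = cv2.imread("downward_camera_frame.png", cv2.IMREAD_GRAYSCALE)

    orb = cv2.ORB_create(nfeatures=2000)
    kp_f, des_f = orb.detectAndCompute(frame, None)
    kp_m, des_m = orb.detectAndCompute(map_img, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_f, des_m), key=lambda m: m.distance)[:200]

    src = np.float32([kp_f[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_m[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    # Where does the centre of the camera frame fall on the stored map?
    h, w = frame.shape
    centre = np.float32([[[w / 2, h / 2]]])
    print(cv2.perspectiveTransform(centre, H))   # approximate map-pixel position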


The graph-based approach is interesting... but I wonder if better and far simpler results might be had by simply using a few iterations of optical flow to perfect the alignment of each frame, starting from the alignment of the previous frame?

As a benefit, the transformation could be applied to images after projecting them onto a deformable mesh, to model the hills etc.


Pretty sure this is optical flow


I love that you can see the approach angle in the distortion of the field. It also helps to convey how thin the atmosphere is to see how long it takes for that to square up.


I've done this kind of stuff through a point and click UI in GIS software. It's really cool seeing a lot of the underlying math and concepts laid out like this.


I'm curious - what did you do, in what software, and how?


ESRI software has had this raster function for quite a while, at least 20 years. Usually 2 or 3 points would suffice. Using hundreds of points was unnecessary.


Hundreds of points lets you get a good average. 2 or 3 requires that you've definitely clicked the same point on both images; a human can use other bits of the image to work that out, but a computer finds it harder.
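
As a toy numeric illustration (synthetic points, not GIS data): an affine georeferencing transform has six unknowns, so three tie points pin it down exactly and inherit any click error, while hundreds give an overdetermined least-squares fit that averages the error out.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 200
    src = rng.uniform(0, 1000, size=(n, 2))        # pixel coords in image A
    A_true = np.array([[1.01, 0.02, 35.0],
                       [-0.015, 0.99, -12.0]])     # "true" affine transform
    dst = src @ A_true[:, :2].T + A_true[:, 2]
    dst += rng.normal(scale=2.0, size=dst.shape)   # simulated click error

    # Solve dst ~ [src | 1] @ A.T in a least-squares sense
    X = np.hstack([src, np.ones((n, 1))])
    A_est, *_ = np.linalg.lstsq(X, dst, rcond=None)
    print(A_est.T)                                 # close to A_true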


I think what he is referring to is georeferencing and reprojecting; you can read how it works in e.g. QGIS here: https://docs.qgis.org/3.16/en/docs/user_manual/working_with_...


Georeferencing processes of various complexity in ArcGIS, PCI Geomatica, and ENVI.


Very impressive.

A next step could be to leave the already-projected images where they are and only draw over them, while marking the latest frame with a border. Sections covered by multiple frames could eventually be used for multi-frame super-resolution.


Beautiful!

Extra kudos to the author for not calling the work done in Torch "learning".


Scott Manley did something similar with the Chang'e 5 landing footage on the Moon:

https://youtu.be/lwLPzU8H3HI


FYI, OpenCV has gradient-based image alignment built in: findTransformECC.

https://docs.opencv.org/3.4/dc/d6b/group__video__track.html#...
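
For reference, a minimal usage sketch, assuming OpenCV 4.x (where the gaussFiltSize argument is required) and placeholder file names:

    import numpy as np
    import cv2

    tmpl = cv2.imread("satellite_patch.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
    img = cv2.imread("descent_frame.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

    warp = np.eye(3, 3, dtype=np.float32)       # initial guess: identity homography
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 200, 1e-6)
    cc, warp = cv2.findTransformECC(tmpl, img, warp, cv2.MOTION_HOMOGRAPHY,
                                    criteria, None, 5)

    # Warp the input frame onto the template using the estimated homography
    aligned = cv2.warpPerspective(img, warp, (tmpl.shape[1], tmpl.shape[0]),
                                  flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)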


OP should try out SuperGlue for features instead of SIFT


Excellent post! I wonder why SIFT didn't find sufficient keypoints early on; it's typically a beast of a method for such a task. It looks like there's some intensity variation (the satellite image is darker), but I'm not sure that would explain it all.


The SIFT algorithm discards low-contrast keypoints. In the beginning the surface looks quite blurry (it seems the camera is auto-focusing on the heat shield), which probably causes only low-quality keypoints to be found on the surface. Additionally, if the algorithm also capped the maximum number of keypoints per image, the situation gets even worse, because strong keypoints on the heat shield (which had to be discarded "manually" later) compete against weak keypoints on the surface.
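
As an illustration of those two knobs in OpenCV's SIFT (example values, not the article's settings): lowering the contrast threshold keeps weaker surface keypoints, and leaving the feature count uncapped stops strong heat-shield keypoints from crowding them out.

    import cv2

    frame = cv2.imread("early_descent_frame.png", cv2.IMREAD_GRAYSCALE)

    sift = cv2.SIFT_create(
        nfeatures=0,              # 0 = no cap on the number of keypoints
        contrastThreshold=0.01,   # default is 0.04; lower keeps low-contrast keypoints
    )
    keypoints, descriptors = sift.detectAndCompute(frame, None)
    print(len(keypoints))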


I've been having trouble finding the answer to this. How close to its intended target did it land?

Thanks


5 meters. However, the "intended target" is not simply defined.

The landing ellipse for Perseverance was 7.7km by 6.6km. The goal is to land at a safe spot within the ellipse rather than land at a specific location.

The new Terrain Relative Navigation capability determines the rover's position relative to the surface during descent by comparing camera images to onboard satellite imagery. On Earth you'd use GPS. No GPS on Mars.

Once the rover knows its position, it can then determine the safest spot to land using an onboard hazard map. The spot it chose to land at and the spot it actually landed at were 5 meters apart.


> Once the rover knows its position, it can then determine the safest spot to land using an onboard hazard map. The spot it chose to land at and the spot it actually landed at were 5 meters apart.

To add a bit more info, poorly remembered from this excellent We Martians episode[0] interviewing Swati Mohan, who is the Mars 2020 Guidance, Navigation and Controls Operations Lead and was the voice of the landing. Go listen to it!

On the way down an image is taken. Using data about how the atmospheric entry is going, and with a lot of constraints that include the hazard map and what kinds of manoeuvres are possible with the descent system (in particular it does a divert, and there are minimum and maximum distances the divert must lie between), a single pixel is chosen from that image to aim for. That pixel represents a 10m x 10m square, and the rover landed within 5m of that square.

The hazard map is created from images with a 1m x 1m resolution, from one of the orbiters (Mars Reconnaissance Orbiter I think). Those images are scaled down for the hazard map, as the on-board image processing had very tight bounds on how long it could search for a valid landing site. The podcast goes into some cool detail about that whole system and its technical design.

0: https://wemartians.com/podcasts/94-guiding-perseverance-to-t...


There is an obvious case where you can't rely on GPS on Earth.

Pershing-2 missiles had radar correlation guidance back in the 80's.

An obvious consequence of Google maps imagery and open source is that a capable college student can make an optical terminal guidance unit out of a mobile phone.


You can see the map of where it landed and its path moving here: https://mars.nasa.gov/mars2020/mission/where-is-the-rover/

The yellow oval is the target landing zone, though it looks like it's a bit too tall on this map compared to other sources.

You can see its landing targets within the oval here: https://www.jpl.nasa.gov/images/jezeros-hazard-map

So it looks like it landed a little over 1km from the center of the oval, if that's your question.

When talking precisely about space travel, things tend to be described as "nominal" rather than as on target or correct. This is because some variance is expected, and systems are designed to work successfully within that variance. In that sense, Perseverance landed within the landing oval and on a safe landing spot, so it was 0 meters away from target.

An analogy would be that it hit the bullseye and got the points, even if it wasn't exactly in the middle of the dart board.


If you look at the last final seconds to the left of the landing you can make out an ancient river delta. That is one of the prime targets they want to investigate.


Or perhaps more importantly, did the terrain navigation software correctly choose an optimal landing location? It seems like it chose one of the rockiest places.


It was designed to have an error under 40m. I don't know what it accomplished.

https://arstechnica.com/science/2019/10/heres-an-example-of-...


Interesting! Couldn't you do it with Blender's tracking, without Python? Although it's much more impressive with Python.



