An interesting side point is that the graph optimization approach used here is somewhat similar to modern graph-based visual SLAM.
The graph in the article can be seen as a factor graph. VSLAM systems usually have a (roughly bipartite) factor graph whose vertices/variables are either keyframes or ‘good’ features, with edges/factors between features and the frames that see them, and between adjacent frames; each of these edges is a factor in the graph. This structure results in very large but sparse graphs, and there are factor graph optimization libraries that take advantage of this sparsity (e.g. g2o or GTSAM). These libraries also use specialized optimization techniques for some of the nonlinear manifolds (e.g. SO(3)) that arise in SLAM problems.
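For a sense of what those libraries look like in practice, here's a minimal GTSAM (Python bindings) pose-graph toy -- not the article's graph, just the standard odometry-chain example:

    import numpy as np
    import gtsam

    # Toy 2D pose graph: three poses chained by "between" factors,
    # plus a prior to pin down the gauge freedom.
    graph = gtsam.NonlinearFactorGraph()
    noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.1, 0.1, 0.05]))

    graph.add(gtsam.PriorFactorPose2(0, gtsam.Pose2(0, 0, 0), noise))
    graph.add(gtsam.BetweenFactorPose2(0, 1, gtsam.Pose2(1, 0, 0), noise))
    graph.add(gtsam.BetweenFactorPose2(1, 2, gtsam.Pose2(1, 0, 0), noise))

    # Deliberately sloppy initial guesses; the optimizer exploits the sparsity.
    initial = gtsam.Values()
    initial.insert(0, gtsam.Pose2(0.1, -0.1, 0.05))
    initial.insert(1, gtsam.Pose2(1.2, 0.1, -0.05))
    initial.insert(2, gtsam.Pose2(1.8, -0.2, 0.1))

    result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
    print(result)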
Maybe they only uploaded a 1080p version of the video - but I was expecting higher def. ...then again, I suppose interplanetary bandwidths are probably not great.
30+ years ago, I had friends reconstructing crime scenes for court proceedings. Architectural drawings and 3D scenes. They used AutoCAD and AutoSolid (?). Showing stuff like blood and ballistics.
Super effective. They turned my stomach.
I don't have words for these Forensic Architecture recreations. I almost feel like I'm there (present).
I can only imagine their future VR recreations will be overpowering.
It'd be pretty neat to lift this up into 3D: you could probably reverse the transforms to recover the camera pose for each frame, then drop it into a scene alongside the camera frustum and the topography, so we could see exactly how much steering the descent stage did to hit its target and how fast it was descending at every stage.
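If the article's per-frame homographies were combined with a guessed camera intrinsic matrix, OpenCV can split each one into candidate rotation/translation pairs -- a rough sketch (K here is entirely made up):

    import numpy as np
    import cv2

    # H: 3x3 homography mapping frame pixels to the orthographic map (from the
    # article's alignment step). K is a guessed pinhole intrinsic matrix for
    # the descent camera -- fx, fy, cx, cy are placeholders, not real values.
    K = np.array([[1000.0, 0.0, 640.0],
                  [0.0, 1000.0, 360.0],
                  [0.0, 0.0, 1.0]])
    H = np.eye(3)  # stand-in; use the homography recovered for a given frame

    # A homography induced by a plane admits up to four (R, t, n) solutions;
    # the physically plausible one has the plane normal facing the camera.
    num, rotations, translations, normals = cv2.decomposeHomographyMat(H, K)
    for R, t, n in zip(rotations, translations, normals):
        print(R, t.ravel(), n.ravel())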
On the other hand, I'd fully expect NASA/JPL to have IMU telemetry from the EDL. If I'm not completely mistaken, it would eventually get published in the PDS here: https://naif.jpl.nasa.gov/naif/data.html
I’ve been thinking more about the navigation of their little helicopter.
On Earth we're used to being able to use GPS for route planning. I wonder if there'd be any benefit to using this process in reverse: constantly determining one's position in 3D space above the surface with a downward-facing camera and stored satellite imagery, cross-referenced with whatever gyro/accelerometer-based positioning they're using. Maybe what they've got already is sufficient for anything you'd want to do in the near future.
> I wonder if there'd be any benefit to using this process in reverse: constantly determining one's position in 3D space above the surface with a downward-facing camera and stored satellite imagery, cross-referenced with whatever gyro/accelerometer-based positioning they're using
That is pretty much exactly how TRN worked for the EDL. I don't think Ingenuity has much in terms of navigation ability, probably just basic INS. But it's also not intended to fly any extended distances, so it doesn't really need any navigation abilities. I'd imagine future copters would use TRN-style navigation.
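A crude sketch of the "camera vs. stored orbital map" matching idea (nothing like the real Lander Vision System -- just ORB features and a homography, with made-up file names):

    import cv2
    import numpy as np

    # Match a downward-facing camera frame against a stored orbital map to
    # estimate where the vehicle is over the terrain. File names are placeholders.
    map_img = cv2.imread("stored_orbital_map.png", cv2.IMREAD_GRAYSCALE)
    frame = cv2.imread("downward_camera_frame.png", cv2.IMREAD_GRAYSCALE)

    orb = cv2.ORB_create(nfeatures=2000)
    kp_map, des_map = orb.detectAndCompute(map_img, None)
    kp_frm, des_frm = orb.detectAndCompute(frame, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_frm, des_map), key=lambda m: m.distance)[:200]

    src = np.float32([kp_frm[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_map[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    # Project the frame centre into map coordinates: that map pixel (times the
    # map's metres-per-pixel) is the estimated ground position under the camera.
    h, w = frame.shape
    centre = cv2.perspectiveTransform(np.float32([[[w / 2, h / 2]]]), H)
    print("estimated position in map pixels:", centre.ravel())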
The graph-based approach is interesting... but I wonder if better and far simpler results might be had by simply using a few iterations of optical flow to refine the alignment of each frame, starting from the alignment of the previous frame?
As a bonus, the transformation could operate on images after they've been projected onto a deformable mesh, to model the hills etc.
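Something like this, maybe -- a sketch using OpenCV's pyramidal Lucas-Kanade flow, where H_prev is assumed to be the previous frame's 3x3 frame-to-map homography:

    import cv2
    import numpy as np

    def align_next_frame(prev_gray, curr_gray, H_prev):
        """Chain the previous frame's map alignment with frame-to-frame flow."""
        # Track a few hundred corners from the previous frame into the current one.
        prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                           qualityLevel=0.01, minDistance=10)
        next_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray,
                                                       prev_pts, None)
        ok = status.ravel() == 1
        # Robust similarity transform mapping prev-frame points to curr-frame points.
        M, _ = cv2.estimateAffinePartial2D(prev_pts[ok], next_pts[ok],
                                           method=cv2.RANSAC)
        # Invert it (curr -> prev), lift to 3x3, and compose with the previous
        # frame's alignment to get curr -> map.
        M_inv = cv2.invertAffineTransform(M)
        M_inv3 = np.vstack([M_inv, [0, 0, 1]])
        return H_prev @ M_inv3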
I love that you can see the approach angle in the distortion of the field. Seeing how long it takes for that to square up also helps convey how thin the atmosphere is.
I've done this kind of stuff through a point and click UI in GIS software. It's really cool seeing a lot of the underlying math and concepts laid out like this.
ESRI software has had this raster function for quite a while, at least 20 years. Usually 2 or 3 points would suffice. Using hundreds of points was unnecessary.
Hundreds of points lets you get a good average. 2 or 3 requires that you've definitely clicked the same point on both images; a human can use other bits of the image to work that out, but a computer finds it harder.
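Roughly the difference between an exactly-determined fit and a robust averaged one (src/dst here are just synthetic stand-ins for matched control points):

    import cv2
    import numpy as np

    # src, dst: matched control points in image and map coordinates (synthetic).
    src = (np.random.rand(300, 2) * 1000).astype(np.float32)
    dst = src + np.random.randn(300, 2).astype(np.float32)  # noisy "clicks"

    # 3 points: the affine warp is exactly determined, so every pixel of click
    # error goes straight into the result.
    M_exact = cv2.getAffineTransform(src[:3], dst[:3])

    # Hundreds of points: RANSAC throws out bad matches, and least squares
    # averages the rest, so individual errors largely cancel out.
    M_robust, inliers = cv2.estimateAffine2D(src, dst, method=cv2.RANSAC,
                                             ransacReprojThreshold=3.0)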
A next step could be to leave the already-projected images where they are and only draw over them, while marking the latest frame with a border. One could even use regions covered by multiple frames to perform multi-frame super-resolution.
Excellent post! I wonder why SIFT didn't find sufficient keypoints early on; it's typically a beast of a method for such a task. It looks like there's some intensity variation (the satellite image is darker), but I'm not sure that would explain it all.
The SIFT algorithm discards low-contrast keypoints.
In the beginning the surface looks quite blurry (it seems the camera is auto-focusing on the heat shield), which probably causes only low-quality keypoints to be found on the surface.
Additionally, if the algorithm also caps the maximum number of keypoints per image, the situation gets even worse, because strong keypoints on the heat shield (which had to be discarded "manually" later) compete against weak keypoints on the surface.
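For what it's worth, OpenCV's SIFT exposes both of those knobs, so one could speculatively loosen them for the blurry early frames:

    import cv2

    # Default SIFT: caps keypoints by response (when nfeatures > 0) and drops
    # low-contrast ones via contrastThreshold -- both work against a blurry,
    # low-contrast surface competing with a sharp heat shield in frame.
    sift_default = cv2.SIFT_create()

    # Looser settings: no cap on keypoint count and a lower contrast threshold,
    # so weaker surface keypoints survive (at the cost of more spurious matches).
    sift_loose = cv2.SIFT_create(nfeatures=0, contrastThreshold=0.01,
                                 edgeThreshold=20)

    img = cv2.imread("descent_frame.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
    kp, des = sift_loose.detectAndCompute(img, None)
    print(len(kp), "keypoints")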
5 meters. However, the "intended target" is not simply defined.
The landing ellipse for Perseverance was 7.7km by 6.6km. The goal is to land at a safe spot within the ellipse rather than land at a specific location.
The new Terrain Relative Navigation capability determines the rover's position relative to the surface during descent by comparing camera images to onboard satellite imagery. On Earth you'd use GPS. No GPS on Mars.
Once the rover knows its position, it can determine the safest spot to land using an onboard hazard map. The spot it chose to land at and the spot it actually landed at were 5 meters apart.
> Once the rover knows its position, it can determine the safest spot to land using an onboard hazard map. The spot it chose to land at and the spot it actually landed at were 5 meters apart.
To add a bit more info, poorly remembered from this excellent We Martians episode[0] interviewing Swati Mohan, who is the Mars 2020 Guidance, Navigation and Controls Operations Lead and was the voice of the landing. Go listen to it!
On the way down an image is taken. Using data about how the atmospheric entry is going, and with a lot of constraints that include the hazard map and what kinds of manoeuvres are possible with the descent system (in particular it does a divert, and there are minimum and maximum distances the divert must lie between), a single pixel is chosen from that image to aim for. That pixel represents a 10m x 10m square, and the rover landed within 5m of that square.
The hazard map is created from images with a 1m x 1m resolution, from one of the orbiters (Mars Reconnaissance Orbiter I think). Those images are scaled down for the hazard map, as the on-board image processing had very tight bounds on how long it could search for a valid landing site. The podcast goes into some cool detail about that whole system and its technical design.
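Purely as an illustration of that selection step (made-up hazard scores and divert limits, nothing to do with the actual flight software):

    import numpy as np

    # hazard: lower is safer; one cell ~ one 10 m x 10 m square of the map.
    hazard = np.random.rand(200, 200)      # placeholder hazard map
    current = np.array([100, 100])         # estimated position (cell indices)
    d_min, d_max = 5, 40                   # allowed divert range in cells (made up)

    # Distance of every cell from the current position.
    ys, xs = np.indices(hazard.shape)
    dist = np.hypot(ys - current[0], xs - current[1])

    # Only cells the descent stage can actually divert to are candidates.
    candidates = np.where((dist >= d_min) & (dist <= d_max), hazard, np.inf)

    # Aim for the safest reachable cell.
    target = np.unravel_index(np.argmin(candidates), hazard.shape)
    print("target cell:", target)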
There is an obvious case where you can't rely on GPS on Earth.
Pershing-2 missiles had radar correlation guidance back in the 80's.
An obvious consequence of Google maps imagery and open source is that a capable college student can make an optical terminal guidance unit out of a mobile phone.
So it looks like it landed a little over 1km from the center of the oval, if that's your question.
When precisely talking about space travel, things tend to be discussed as "nominal" instead of being on target or correct. This is because some variance is expected, and systems are designed to work successfully within that variance. In that sense, Perseverance landed within the landing oval and on a safe landing spot, so it was 0 meters away from target.
An analogy would be that it hit the bullseye and got the points, even if it wasn't exactly in the middle of the dartboard.
If you look to the left of the landing site in the final seconds, you can make out an ancient river delta. That is one of the prime targets they want to investigate.
Or perhaps more importantly, did the terrain navigation software correctly choose an optimal landing location? It seems like it chose one of the rockiest places.