In areas you have visited previously, you have two estimates of your position: one from your frame-to-frame estimates, and another from the map you built of the area the first time. You can then solve an optimization problem to bring those two estimates closer together.
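For intuition, here's a toy pose-graph sketch in Python of that reconciliation step. Everything in it is illustrative, not from the paper: the odometry constraints and the loop-closure constraint disagree, and least squares spreads the accumulated drift across the trajectory.

```python
import numpy as np
from scipy.optimize import least_squares

# Frame-to-frame (odometry) estimates for a square loop in 2D,
# each step carrying a little simulated drift.
odometry = [np.array([1.0, 0.02]),    # east
            np.array([0.0, 1.02]),    # north
            np.array([-1.0, 0.02]),   # west
            np.array([0.0, -0.98])]   # south; drift sums to (0, 0.08)

# Loop closure: place recognition says pose 4 is the same spot as pose 0.
loop = (4, 0, np.array([0.0, 0.0]))   # relative offset should be zero

def residuals(x):
    poses = x.reshape(-1, 2)
    res = [poses[0]]                              # anchor pose 0 at the origin
    for i, od in enumerate(odometry):             # odometry constraints
        res.append((poses[i + 1] - poses[i]) - od)
    i, j, rel = loop                              # loop-closure constraint
    res.append((poses[i] - poses[j]) - rel)
    return np.concatenate(res)

x0 = np.zeros(10)                                 # 5 poses * 2 dims
sol = least_squares(residuals, x0)
print(sol.x.reshape(-1, 2))   # the 0.08 drift is distributed over the path
```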
To find out whether you've already visited an area, you store a description of each location in a database and search through them. The paper says they use a compressed representation of the "maps" and use test-time training to optimize the global consistency between their submaps.
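The lookup part could be as simple as nearest-neighbour search over those compressed descriptors. A hypothetical sketch (the descriptor size, threshold, and names are all made up, not from the paper):

```python
import numpy as np

# One compact descriptor per stored submap.
db = np.random.randn(1000, 64)
db /= np.linalg.norm(db, axis=1, keepdims=True)   # normalize for cosine similarity

def find_revisit(query, threshold=0.9):
    """Return the index of the best-matching submap, or None if it's a new place."""
    q = query / np.linalg.norm(query)
    scores = db @ q                               # cosine similarity vs. all submaps
    best = int(np.argmax(scores))
    return best if scores[best] > threshold else None

current = np.random.randn(64)
match = find_revisit(current)   # if not None, trigger the optimization above
```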
> This is a reimplementation of LoGeR; complete code and models will be released upon approval.
I don't understand why it's a reimplementation either.
I would guess it's "research" code anyway, so not really usable unless you're an expert.
Relocalisation is the bit that's surveillance-y. But it's also crucial for accurate visual-only navigation.
If you want reconstruction and training of robotic movement, this is far more appropriate. I believe we're going to see robots being able to "dream": analysing historical video of spaces to improve movement and navigation.
So not mass surveillance, but there's probably a future of mass subjugation using robot enforcement.
I can imagine future iterations of this that bring together other stills of the same space at the same time to augment the dataset. Then perhaps another pass to fill in gaps with likely missing content, based on probability or on data from, say, the same street 10 years later.
It won't be 100% real, but I think it'd be very cool to be able to have a Google Street View-style experience of areas before Street View existed [0][1].
Lidars are great, and getting smaller, but they still eat a lot of power. (The Quest 3 had a "lidar" on the front, really structured light, and it was mostly not used.)
For machines to understand the 3D world, they first need to extract geometry, then isolate those geometries into objects. This method is _a_ way to do the first step: extracting 3D points.
The problem with this model is that the points are not actually that well aligned frame to frame, which is why it looks a bit blurry. I assume this is to avoid running out of memory, since you're not quite sure which points are relevant and need to be kept around.
Once you have those points, you need to replace them with simplified geometry, so that you can work out intersections and the like.
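For example, one common way to get that simplified geometry is fitting planes to patches of the cloud. A minimal sketch, assuming plain least-squares plane fitting via SVD (not necessarily what this model's pipeline does):

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through an (N, 3) patch of points."""
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value
    # is the direction of least variance, i.e. the plane normal.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    return centroid, normal        # plane: dot(p - centroid, normal) = 0

# A noisy, roughly horizontal patch of points.
pts = np.random.rand(200, 3)
pts[:, 2] *= 0.01
c, n = fit_plane(pts)
print(n)                           # ~ [0, 0, +/-1]
```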
Also, I'm not sure how heavy LIDAR units are, but remember that the heavier the payload, the more the flight time is reduced. Some drones can carry only a single payload, so if you also want to capture (high-res) video/images you need to fly again.
It all depends on the use case.
[0] https://arstechnica.com/gadgets/2017/09/googles-street-view-...
[1] https://en.wikipedia.org/wiki/Google_Street_View