I built a browser-based tool for detecting objects in satellite imagery using vision-language models (VLMs). You draw a polygon on the map and enter a text prompt such as "swimming pools", "oil tanks", or "buses". The system scans the selected area tile-by-tile and returns detections projected back onto the map as GeoJSON.
Pipeline: select area and zoom level, split the region into mercantile tiles, run each tile with the prompt through a VLM, convert predicted bounding boxes to geographic coordinates (WGS84), and render the results back on the map.
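A minimal sketch of the tiling and reprojection steps, assuming standard Web Mercator (XYZ) tiles at 256 px and VLM boxes returned as pixel coordinates `(xmin, ymin, xmax, ymax)` within a tile. The function names, the tile size, and the box format are illustrative assumptions, not the tool's actual code; the `mercantile` library offers equivalent helpers such as `mercantile.tiles` and `mercantile.bounds`.

```python
import math

TILE_SIZE = 256  # assumed tile size in pixels


def lonlat_to_tile(lon, lat, zoom):
    """Return the XYZ tile (x, y) containing a WGS84 point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so points on the east/south edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_bbox(west, south, east, north, zoom):
    """List every (x, y, zoom) tile that covers a lon/lat bounding box."""
    x0, y0 = lonlat_to_tile(west, north, zoom)  # top-left tile
    x1, y1 = lonlat_to_tile(east, south, zoom)  # bottom-right tile
    return [(x, y, zoom) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)]


def tile_pixel_to_lonlat(x, y, zoom, px, py, tile_size=TILE_SIZE):
    """Convert a pixel position inside a tile to WGS84 (lon, lat)."""
    n = 2 ** zoom
    fx = x + px / tile_size  # fractional tile column
    fy = y + py / tile_size  # fractional tile row
    lon = fx / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * fy / n))))
    return lon, lat


def box_to_feature(tile, box, label):
    """Turn a pixel bounding box on a tile into a GeoJSON polygon Feature."""
    x, y, zoom = tile
    xmin, ymin, xmax, ymax = box
    west, north = tile_pixel_to_lonlat(x, y, zoom, xmin, ymin)
    east, south = tile_pixel_to_lonlat(x, y, zoom, xmax, ymax)
    ring = [[west, south], [east, south], [east, north], [west, north], [west, south]]
    return {
        "type": "Feature",
        "properties": {"label": label},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }
```

In this sketch, each tile from `tiles_for_bbox` would be sent to the model with the prompt, and every returned box would go through `box_to_feature` before being drawn on the map.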
It works reasonably well for distinct structures in a zero-shot setting. Occluded objects are still better handled by specialized detectors like YOLO models.
There is a public demo, and no login is required. I am mainly interested in feedback on detection quality, performance tradeoffs between VLMs and specialized detectors, and potential real-world use cases.
I’m thinking about adding new features. Which one would be most useful and should come next: searching with an image of an object, detecting changes over time using Sentinel-2 data, or detecting objects in all Google Street View images within a selected area?
One guy made a similar solution for our hackathon (airplane detection):
https://github.com/nabetse00/webnova_submision/blob/main/Pyt...
Very cool, I've been trying to make something like this as well (for very niche use cases). When I am on mobile, the maximum size of the selection polygon seems to be very small, like the size of one block?
Tangent question: I know of services like Planet Labs and Maxar. Assuming you had the money, is the capability there now to tag a ship from space and watch it travel? I know there is something like ADS-B for ships, but it would be interesting.
What kind of image set is used here? After a quick scan over the site, I can't tell whether I can provide my own aerial or satellite imagery or whether you provide it.
very cool
Once I figured out how to use the UI, I did 2 scans. For the first one, I had to zoom in before the identification boxes popped up. At first I thought it didn't do anything.
Second scan I put over a local aviation museum with a mix of helicopters, unusual planes, cars, buildings, and other equipment. I was surprised to see everything identified correctly, though it missed a single helicopter.
I'd love a little bell or notification when the scan completes, as I hit 'scan', switch to a different tab, and then forget I was waiting.
I have been looking for a way to find pickleball courts near my house. This sounded like the perfect tool! I tried "pickle ball court" using the maximum polygon size (actually just slightly under) and... it didn't work. It gave me a red dashed square around a driveway to what looked like a farm.
Cool concept though.
Great idea, but almost impossible to use on a phone because of the mobile UI.
It successfully located a submerged car in a bayou. (Note, this is a known finding, and nothing reportable). So I think there are some possible positive use cases here. I'm curious what other unsolved mysteries are now solvable with computation.
This is cool. I'll give it a go
How should the search box respond to a search of Las Vegas, NV or San Antonio, TX? I'm not getting any response.