
Clarity of Vision

  • Writer: Kshitij Duraphe
  • Oct 23
  • 9 min read

Updated: Nov 10

Or: why astronomers hated our AI, while defense and security teams think it's the biggest disruptor yet to come.


We began by looking up at the night sky, building AI to help astrophotographers and amateur astronomers see distant galaxies more clearly and efficiently. Our first model was designed to track objects in the night sky for extended periods while removing atmospheric disturbances.


The results were promising, as shown below:



Unfortunately, potential users wanted nothing to do with it. Meanwhile, security teams and the defense industry had different ideas about our models.


Astrophotographers, along with much of the space industry, turned out to have a deep mistrust of generative AI. When you are imaging light that has traveled millions of years through space, the last thing you want is an algorithm that might hallucinate features that aren't there. They were right to be skeptical. We were trying to solve the wrong problem.


Then we were approached by defense and security professionals, and what we found was extremely exciting. Security and defense teams were drowning in petabytes of video footage so degraded it might as well have been blank: dark warehouses, foggy intersections, rainy exteriors, drone feeds obscured by smoke. All of it recorded, all of it stored, and almost all of it useless. When investigators asked, “What happened in the parking lot between 2 and 4 a.m.?” the answer was often, “We have footage, but you can’t make out anything.”


Fast forward to today: Absentia Tech builds AI for bad video, deploying our models both on-device and in the cloud depending on post-processing needs.


The Elephant in the Surveillance Room


Most people think surveillance is about live monitoring: someone watching feeds, spotting threats in real time, and responding immediately. That happens, but it’s not the common case. The real challenge is post-incident investigation. Something went wrong—a theft, a missing person, an accident, an unauthorized access, or a drone sortie that needs analysis—and now you need to reconstruct what happened from whatever cameras were recording.


This is post-mission ISR: Intelligence, Surveillance, and Reconnaissance after the fact. Military units fly drones through contested environments, recording everything. Security teams deploy cameras to cover every angle of a facility. Autonomous vehicles continuously capture their surroundings. When incidents occur, investigators inherit hours or even days of footage, most of it degraded by darkness, weather, or atmospheric conditions. The footage exists. The evidence exists. But human eyes can’t extract anything useful—and even when they can, how much can they really see while drowning in massive volumes of visual data limited by environmental factors?


We spent months talking to security directors and military analysts. The same frustration surfaced again and again: they knew critical information was buried in their video archives, but finding it meant squinting at dark screens and foggy frames. Object detection algorithms failed because they require clean inputs. Traditional enhancement techniques hit physical limits; you can’t recover information that never reached the sensor.


This wasn't a technology in search of a problem. These were users desperate for exactly what our technology could do.



The Right Tool for Each Job


The astrophotographers’ concern applies to our work too. Are we making things up?


This question assumes we always use generative models. We don’t. Most of our enhancement pipeline relies on non-generative techniques such as learned denoising networks, physics-informed restoration, and multi-frame fusion algorithms. These methods don’t generate pixels from learned priors; they extract and amplify the signal that is actually present in the degraded input.
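The distinction can be made concrete with a toy example. The sketch below (a hypothetical illustration, not our pipeline) shows the simplest form of multi-frame fusion: averaging co-registered noisy frames of a static scene. Because the noise is independent across frames, averaging N frames cuts the noise standard deviation by roughly √N, and every output pixel is derived from measurements that actually reached the sensor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical illustration: a static scene observed through heavy sensor noise.
# The "signal" is a fixed pattern; each frame adds independent Gaussian noise.
signal = rng.uniform(0.2, 0.8, size=(64, 64))

def noisy_frame(sigma=0.5):
    return signal + rng.normal(0.0, sigma, size=signal.shape)

def fuse(frames):
    # Multi-frame fusion in its simplest form: average co-registered frames.
    # Independent noise shrinks by ~sqrt(N); no pixels are invented.
    return np.mean(frames, axis=0)

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

single = noisy_frame()
fused = fuse([noisy_frame() for _ in range(16)])

print(f"single-frame error: {rmse(single, signal):.3f}")
print(f"16-frame fused error: {rmse(fused, signal):.3f}")
```

Sixteen frames cut the error by roughly a factor of four, which is exactly the "extract and amplify what is actually present" behavior described above, as opposed to generating pixels from a learned prior.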


Generative models come into play only when appropriate. When recording video in near-darkness, photons still reach the sensor. The signal exists, though barely above the noise. Non-generative algorithms can recover much of that weak signal without inventing anything. However, there are cases where the signal is so degraded that deterministic approaches reach their fundamental limits. In those situations, generative techniques become useful, not as a default, but as a targeted tool.


When we do use generative models, the choice depends on constraints. For real-time processing, we use a GAN-based architecture that runs fast enough to achieve 30 frames per second on consumer NVIDIA hardware. GANs operate in a single pass: feed them a degraded frame and receive an enhanced one instantly. The quality ceiling is lower than with iterative methods, but the latency is predictable and acceptable for live monitoring.


For post-mission ISR, where investigators analyze recorded footage and time pressure is measured in hours rather than milliseconds, we can employ more sophisticated approaches. Certain types of degradation benefit from diffusion models or multi-stage generative pipelines that deliver higher-quality restoration at the cost of longer processing times. A few seconds per frame is acceptable when the alternative is declaring the footage unusable.


The key insight is that enhancement quality is not the only metric. Latency, compute cost, reliability, and downstream task performance all matter. AI-restored images, whether produced by generative or non-generative techniques, serve a single purpose: enabling object detection, segmentation, and tracking algorithms to function where they would otherwise fail.


A security officer might not be able to verify the accuracy of every pixel, but when the system identifies three people entering through a loading dock at 3:47 a.m. and tracks their movements through the facility—when investigators can see those enhanced frames clearly enough to corroborate other evidence—the value becomes clear. The restoration exists to make downstream analysis possible, not to produce gallery-quality imagery.


Our SPECTER model handles extreme low-light conditions using primarily non-generative restoration, with optional GAN-based refinement for real-time deployment. Feed it footage from a warehouse at night with minimal ambient light, and it recovers structural details and moving objects that are invisible to conventional processing. GHOST specializes in cutting through fog, smoke, and particulate obscurants, a capability critical for analyzing incidents in adverse weather or emergency conditions. SPIRIT compensates for atmospheric distortion that degrades long-range surveillance and drone footage.


Each model targets specific degradation types because the physics of light loss differ. The optimal architecture differs. The acceptable latency differs. Some scenarios require real-time processing at 30 frames per second, while others can wait for higher-quality post-mission analysis. We built separate pipelines because no single architecture can satisfy both constraints.


Real-Time and Post-Mission: Different Problems


The engineering challenges for real-time versus post-mission processing are completely different.


Real-time means 30fps minimum, preferably on a single GPU that might be running other tasks simultaneously. You get one shot at each frame. No iterative refinement. No multi-pass algorithms. The GAN architecture we use for real-time generative enhancement is heavily optimized: pruned networks, quantization where possible, aggressive caching of intermediate features. We sacrifice some quality for guaranteed latency.


Post-mission analysis operates under different constraints. When investigators are reviewing recorded footage, processing at 2-5x realtime is acceptable. That opens up techniques that are too slow for live deployment. Multi-frame temporal fusion can pool information across dozens of frames to denoise more aggressively. Heavier models can run without breaking latency budgets. For cases where generative approaches add value, we can use more sophisticated architectures that iteratively refine outputs.
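A rough sketch of what that tradeoff means in wall-clock terms, reading "2-5x realtime" as each hour of footage taking two to five hours to process (the footage volume below is hypothetical):

```python
def processing_hours(footage_hours: float, slowdown: float) -> float:
    """Wall-clock time when processing runs at `slowdown` x the footage
    duration (e.g. 2.0 means each hour of footage takes two hours)."""
    return footage_hours * slowdown

# Hypothetical overnight workload: 12 hours of recorded warehouse footage.
for slowdown in (2.0, 5.0):
    print(f"{slowdown:.0f}x realtime -> "
          f"{processing_hours(12.0, slowdown):.1f} h of processing")
```

An overnight batch job that finishes in one to three days is a perfectly acceptable cost when the alternative is declaring the footage unusable, which is why the heavier techniques become available here.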


The technical principle is simple: optimize for your actual use case, not the theoretical ideal. Most academic research targets quality at any compute cost. We target adequate quality at practical compute costs for specific, high-value applications. Real-time monitoring and post-incident investigation have different definitions of "adequate" and "practical."


The Interface Matters As Much As The Underlying Models


Security teams don't want to scrub through enhanced footage frame by frame. They want answers to questions. "Show me everyone who entered the building through the south entrance." "Find vehicles that stopped near the fence line." "Track the person wearing dark clothing who appears in the loading area around midnight."


We built a query system that treats video investigation as a conversation. Natural language questions get parsed, mapped to detection and tracking tasks, executed across enhanced footage, and returned as timestamped results with visual evidence. Enhancement—whether non-generative or generative—runs as a preprocessing step, making the footage clean enough that object detection algorithms work reliably.
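A deliberately minimal sketch of that first parsing step is below. Everything here is hypothetical (the schema, keyword lists, and patterns are illustrative stand-ins, not our production parser, which uses learned language models rather than regexes); the point is the shape of the mapping from a free-form question to a structured detection task.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class VideoQuery:
    """Structured task parsed from a natural-language question (hypothetical schema)."""
    target: str                # object class for the detector: "person", "vehicle", ...
    location: Optional[str]    # named region of interest, if mentioned
    action: Optional[str]      # behavior filter: "entered", "stopped", ...

KEYWORDS = {
    "person": ["everyone", "person", "people", "who"],
    "vehicle": ["vehicle", "vehicles", "car", "cars"],
}

def parse_query(text: str) -> VideoQuery:
    t = text.lower()
    target = next((cls for cls, words in KEYWORDS.items()
                   if any(w in t for w in words)), "object")
    loc = re.search(r"(?:near|through|in) the ([\w\s]+?)(?:\.|,|$)", t)
    act = re.search(r"\b(entered|stopped|left|appears?)\b", t)
    return VideoQuery(target,
                      loc.group(1).strip() if loc else None,
                      act.group(1) if act else None)

q = parse_query("Show me everyone who entered the building through the south entrance.")
print(q)
```

The resulting `VideoQuery` is what gets handed to detection and tracking on the enhanced footage; the answers come back as timestamped results with visual evidence rather than raw video.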


This is why the astronomical community's rejection led somewhere valuable. Astronomers need pixel-perfect accuracy for scientific measurement. Security and defense teams need reliable answers to investigative questions. The acceptable error modes are completely different. An astronomer can't use an image where subtle features might be hallucinated. An investigator can use enhanced footage where details are probabilistically inferred, as long as the object-level conclusions (person present, vehicle type, direction of movement) are reliable.


Our PHANTOM model finds barely-visible objects by combining weak signals across multiple frames with learned priors about object appearance. WRAITH does something different: predictive tracking. When an object disappears behind an obstruction or into deep shadow, WRAITH predicts trajectory based on motion history and scene context. Investigators need continuous tracks, not fragmented detections.
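The post doesn't spell out WRAITH's internals, but the simplest version of trajectory prediction, a constant-velocity extrapolation fitted to the observed motion history, conveys the idea (the code below is an illustrative stand-in, not WRAITH itself, which also conditions on scene context):

```python
import numpy as np

def predict_track(history: np.ndarray, steps: int) -> np.ndarray:
    """Extrapolate an occluded object's positions from its motion history.

    history: (N, 2) array of observed (x, y) centroids before occlusion.
    Fits a constant-velocity model (least-squares line per coordinate)
    and projects it forward over the occluded frames.
    """
    n = history.shape[0]
    t = np.arange(n)
    preds = []
    for dim in range(2):
        slope, intercept = np.polyfit(t, history[:, dim], 1)
        future_t = np.arange(n, n + steps)
        preds.append(slope * future_t + intercept)
    return np.stack(preds, axis=1)

# Object moving right at ~2 px/frame and slightly downward, then occluded:
seen = np.array([[10, 50], [12, 49], [14, 48], [16, 47]], dtype=float)
print(predict_track(seen, steps=3))
```

Bridging detections this way is what turns fragmented sightings into the continuous tracks investigators actually need.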


The engineering principle: design for the decision, not the data. What insights do users need to make conclusions based on this video? What confidence levels are required to make those decisions? We now build systems that deliver real insight and support those decisions.


The Trust Problem


Military and security customers need to know what's real versus what's generated. Operational decisions will depend on enhanced footage, and false confidence in AI outputs could be disastrous, whether in operational decision-making or in a courtroom.


We address this fundamental issue through layered metadata and configurable transparency. Every frame carries provenance information: which model processed it, enhancement strength, input quality metrics, confidence scores for detected objects. Users can view original footage alongside enhanced versions. They can adjust enhancement intensity—trading clarity for certainty about authenticity.
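A per-frame provenance record along these lines might look like the sketch below. The schema is hypothetical (field names and types are illustrative, not our actual metadata format), but it captures the fields described above: which model ran, how hard it was pushed, how degraded the input was, and per-object confidence.

```python
from dataclasses import dataclass, field, asdict
from typing import Dict

@dataclass(frozen=True)
class FrameProvenance:
    """Per-frame provenance record (hypothetical schema)."""
    frame_index: int
    model: str                    # e.g. "SPECTER"
    model_version: str
    enhancement_strength: float   # 0.0 = passthrough, 1.0 = maximum
    input_quality: float          # estimated input quality metric (e.g. SNR)
    detections: Dict[str, float] = field(default_factory=dict)  # object id -> confidence

rec = FrameProvenance(
    frame_index=4021,
    model="SPECTER",
    model_version="hypothetical-1.0",
    enhancement_strength=0.6,
    input_quality=0.18,
    detections={"person_3": 0.91},
)
print(asdict(rec))
```

Because the record is immutable and travels with the frame, users can always compare original against enhanced output and see exactly how the enhanced version was produced.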


For critical applications, we support conservative modes where enhancement is limited to physically-justifiable operations. You lose capability, but you gain defensibility. Different missions have different trust requirements.


Mystic makes it easy for users to access and measure confidence levels. Instead of manually sourcing this data, they can see at a glance how confident the system is in its results, helping them make more informed decisions about the authenticity of the footage and the reliability of the analysis.


This represents another engineering principle: don't hide your system's limitations. Build control surfaces that let users adjust the capability-confidence tradeoff for their specific context. Some investigations need maximum information extraction regardless of uncertainty. Others need courtroom-defensible evidence chains.



What We Actually Built


Our SHADE model handles mixed lighting conditions where parts of a scene are overexposed while others are in shadow. It’s especially useful for backlit situations and high-contrast environments.


We focused on ensuring that no critical incident goes unanalyzed simply because the footage was too degraded. A theft in a dark warehouse, a drone flying through smoke, or an accident in fog—these events leave traces in video that humans can’t see but properly designed AI can recover.


Post-incident investigation is where our innovations stand out, though our real-time capabilities also serve critical live-monitoring needs for select customers. Recorded footage, reviewed after the fact, is what matters when an incident occurs that someone needs to understand: military post-mission ISR, security incident review, accident reconstruction, legal discovery, or any scenario where video exists but appears unusable.


The technical achievement lies in building an enhancement pipeline that delivers real-time performance when needed and superior quality for post-mission analysis when time allows. It chooses the right technique for each degradation type and deployment constraint. The product achievement is packaging that capability into a query interface that security and intelligence professionals can actually use.


We run entirely on NVIDIA hardware because nothing else comes close for this workload. Their GPUs, drivers, CUDA ecosystem, and TensorRT optimization toolkit make up a stack built precisely for this kind of intensive neural network inference. We tried alternatives. They weren’t viable.


What We Learned From Failure


Our pivot away from astronomy taught us that having sophisticated technology doesn't mean you have a product. Astronomers had legitimate reasons to reject generative AI. Their requirements were incompatible with our approach. We could have spent years trying to overcome their objections or adapt our technology to meet their standards.


Instead, we looked for users whose problems matched our capabilities. Security and defense teams didn't care about pixel-perfect reconstruction. They cared about actionable intelligence from footage they'd otherwise abandon. Their problem was severe enough that even imperfect solutions had massive value.


Pivots are often dressed up as strategic foresight. Ours was a revelation. We built something valuable that our intended users didn't want. We found different users with different needs who desperately wanted something similar.


The technical principles that carried over are:

- Process video where signal is barely distinguishable from noise

- Extract maximum information from minimal input

- Run fast enough for practical workflows

- Present results in actionable formats


The business insight: find users whose constraints match your capabilities, not users whose ideal requirements match your vision.



The difference in visibility after a video of cars driving at night is processed with an AI agent (SPECTER) for low-light enhancement.


Where This Goes


Security teams have petabytes of degraded footage sitting in archives. Every parking lot incident, every perimeter breach, every unexplained event—recorded but not analyzed because analysis seemed futile. We're making that footage useful.


Downstream applications matter as much as enhancement. When object detection works reliably on restored video, you can build automated systems for pattern detection. Track vehicles across multiple camera feeds. Identify repeated behaviors. Flag anomalies. The enhancement is an enabling layer for higher-level intelligence.


We're not trying to replace human judgment. We're trying to give analysts enough information and insights so their judgment isn’t blinded. When a security director asks "What happened?" we want the answer to be based on evidence, not guesswork.



Mystic, an LLM layered with CNN-based AI agents that enhance video from natural-language prompts. Mystic summons purpose-built AI agents for different types of visual enhancement.


Absentia Tech builds AI systems for video enhancement and analysis. We process footage from drones, security cameras, dash-cams, and autonomous vehicles—both in real-time for live monitoring and post-mission for incident investigation. Everything runs on NVIDIA hardware.


Check out absentiatech.com to learn more or reach out to request a demo.


- Written by the Absentia Engineering Team
