Seeing Through Interference: Building a Weather-Aware Vision System
- Kshitij Duraphe
- Nov 11
- 11 min read
Most computer vision models assume the world presents itself cleanly, but security cameras operating in uncontrolled environments tell a different story. They record footage degraded by rain, snow, fog, darkness, compression artifacts, and lens distortion, and these conditions appear unpredictably and often overlap in ways that make simple enhancement strategies fail. A camera monitoring a warehouse entrance at three in the morning during January might contend with low light, freezing fog, and motion blur from precipitation driven by wind, all simultaneously degrading the image in different ways. The footage exists, the camera records, but the image carries almost no information that a human or algorithm can use.
Typical enhancement systems treat degradation as a single undifferentiated problem—take a bad image and make it better. That approach works when you know what kind of degradation you face, but it fails when the system must operate autonomously across arbitrary conditions without human operators labeling each frame. A model trained to remove rain will struggle with fog, and a model tuned for darkness will mishandle atmospheric haze, sometimes reducing visibility instead of improving it. We needed something different. The system had to understand why an image looked degraded before deciding how to fix it, which meant building classification into the enhancement pipeline rather than treating it as a separate problem.
This post explains how we developed a system that detects five primary degradation types and routes each frame through the appropriate restoration model in real time. More importantly, it describes the insight that made the system practical: weather and visibility change slowly enough that you don't need to classify every frame, only watch for transitions.
Understanding Degradation
Haze reduces contrast through light scattering, which happens when suspended particles in the atmosphere scatter photons before they reach the camera sensor. Distant objects fade toward the background color more than near objects because the farther light travels through haze, the more it scatters, creating a depth-dependent degradation that's spatially smooth with no sharp boundaries separating hazy regions from clear ones.
Snow introduces bright, irregular occlusions that vary in size, shape, position, and opacity from frame to frame, reflecting light sources and creating bright spots that obscure scene details in ways that change rapidly. Rain draws directional streaks across the image as fast-moving drops create motion blur in predictable directions determined by wind and camera angle, producing elongated patterns rather than the discrete spots you see with snow.
Darkness reduces dynamic range and increases noise because insufficient illumination means too few photons reach the sensor, dropping the signal-to-noise ratio to the point where scene details remain buried in sensor noise. Distortion arises from the imaging system itself rather than environmental conditions—compression artifacts from H.264 encoding create blocking patterns, cheap lenses introduce chromatic aberration and geometric warping, and rolling shutter effects distort fast motion.

The five canonical degradation types: haze, snow, rain, darkness, and distortion. Each affects image signal differently.
The problem is that these factors often overlap in real footage. A foggy night combines haze and darkness, rain during evening twilight mixes precipitation streaks with low light, and compression artifacts appear regardless of environmental conditions, so a single frame might exhibit three or four degradation types simultaneously. This overlap means a single restoration model cannot handle all cases without first understanding which degradations are present, because each type affects the underlying image signal differently.
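To make "affects the signal differently" concrete, here is a minimal NumPy sketch that degrades the same synthetic scene two ways. The haze branch uses the standard atmospheric scattering model I = J·t + A·(1−t), which pulls pixels toward the airlight and collapses contrast without adding noise; the darkness branch uses low gain plus sensor noise, which collapses brightness and signal-to-noise ratio. All constants are illustrative, not measured values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "clean" scene: a horizontal gradient with some texture.
clean = np.clip(np.linspace(0.2, 0.8, 64)[None, :] +
                0.1 * rng.standard_normal((64, 64)), 0, 1)

# Haze: I = J*t + A*(1 - t). Scattering mixes in the airlight A,
# which lowers global contrast but adds no noise.
t, A = 0.5, 0.9                       # uniform transmission and airlight
hazy = clean * t + A * (1 - t)

# Darkness: few photons -> weak signal plus sensor noise,
# which drops the signal-to-noise ratio.
gain, sigma = 0.15, 0.03
dark = np.clip(clean * gain + sigma * rng.standard_normal(clean.shape), 0, 1)

print(f"clean contrast (std): {clean.std():.3f}")
print(f"hazy  contrast (std): {hazy.std():.3f}")    # lower than clean
print(f"dark  mean brightness: {dark.mean():.3f}")  # far below clean mean
```

The two outputs fail in different statistics, which is exactly why a single undifferentiated "make it better" model struggles.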
The Early Model and Its Limits
We initially tried to train one deep network to restore all degraded frames regardless of condition, building a large architecture trained on examples of all five degradation types with the goal of simplicity: one model, one forward pass, one enhanced output. The universal model underperformed in ways that were instructive. Training was slow and unstable, the network overfit to whichever degradation type appeared most frequently in the training data, and when two or more degradations coexisted in a single frame the model struggled to decide which correction to prioritize, producing enhancement quality that varied unpredictably.
The model was also opaque in a way that made iteration difficult. When enhancement failed we could not easily determine why—was the classification implicit in the network incorrect, or was the restoration pathway inadequate? This experience led to a realization about the structure of the problem: enhancement cannot succeed without classification, which means that before applying any restoration the system must identify what type of degradation it faces.
Moving Toward Modular Reasoning
We reframed the architecture by splitting it into two stages instead of trying to do everything at once. The first stage determines what type of degradation is present, and the second stage applies a specialized enhancement model designed for that type, which improved explainability because each component had a clear responsibility. The classifier only needed to recognize conditions, not fix them, and the enhancement models only needed to handle one degradation type, which allowed smaller networks and faster inference.
Modularity also supported iteration in ways the monolithic approach never could. We could improve the rain removal model without touching the haze model, retrain the classifier without rebuilding the enhancement pipeline, and test each piece independently, which meant development velocity increased substantially. The modular approach introduced new challenges though—degradation types are not mutually exclusive, so we needed multi-label classification with probability scores rather than a single hard category.

Two-stage architecture: classify degradation type, then route to specialized enhancement model.
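The two-stage split can be sketched in a few lines. Everything below is a hypothetical skeleton, not the production code: the classifier is a placeholder that only flags darkness, and identity functions stand in for the specialized restoration networks.

```python
import numpy as np

CONDITIONS = ("haze", "snow", "rain", "darkness", "distortion")

def classify(frame):
    """Stand-in multi-label classifier: one score in [0, 1] per condition.
    The real classifier is a small neural network; this placeholder only
    flags darkness from mean brightness."""
    scores = {c: 0.0 for c in CONDITIONS}
    scores["darkness"] = 1.0 if frame.mean() < 0.2 else 0.0
    return scores

# One independently replaceable enhancer per degradation type; identity
# functions stand in for the specialized restoration models.
ENHANCERS = {c: (lambda f: f) for c in CONDITIONS}

def route(frame, scores, threshold=0.5):
    """Apply every enhancer whose condition score clears the threshold."""
    for cond, score in scores.items():
        if score >= threshold:
            frame = ENHANCERS[cond](frame)
    return frame

dark_frame = np.full((8, 8), 0.05)
enhanced = route(dark_frame, classify(dark_frame))
```

Because each enhancer owns exactly one condition, swapping out the rain model can never regress the haze path, which is the iteration property described above.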
How Absentia Defines Degradation
The system uses five canonical categories—haze, snow, rain, darkness, and distortion—and these categories emerged from observation rather than theory. We reviewed thousands of frames from real security cameras deployed outdoors and found that most visibility failures fit into one of these five types, which balances coverage with specificity in a way that proved practical. Each category corresponds to a different interaction between light, air, and the camera sensor, so the physical causes differ and the correction strategies differ accordingly.
We treat these categories as the core vocabulary of the system, meaning an image is not generically bad but rather suffers primarily from haze or primarily from darkness or primarily from distortion. The classifier outputs five independent probabilities that can sum to more than one because multiple types can coexist, and the routing logic interprets these probabilities to select the appropriate restoration pathway. This explicit taxonomy makes the system easier to reason about—engineers can ask why a particular frame routed to the darkness model, inspect the classifier's probability outputs, and verify that the decision matches their own assessment.
Extracting Reliable Cues
The classifier extracts lightweight features from each frame rather than relying entirely on learned representations, focusing on interpretable metrics that correspond to physical properties of degradation. Global brightness, computed as mean luminance across the entire frame, indicates darkness: low values suggest insufficient illumination, and the feature costs almost nothing to compute while remaining stable under compression noise. Contrast and edge density reveal haze because atmospheric scattering lowers contrast globally and blurs edges, so a frame with low contrast and sparse edges likely suffers from fog or mist.
Directional coherence of edges distinguishes rain from snow by measuring whether edge orientations cluster around a dominant angle. Rain produces streaks aligned in a consistent direction while snow produces randomly scattered occlusions, so high directional coherence suggests rain and low coherence suggests snow. Temporal difference between consecutive frames quantifies precipitation activity because heavy rain or snow causes large pixel changes from one frame to the next as particle positions shift, while static haze causes minimal temporal variation.
Each feature captures one aspect of degradation and no single feature perfectly identifies a condition, but the classifier combines all features using a small neural network trained on labeled examples. We deliberately avoided complex preprocessing or expensive feature extraction because the goal was speed and transparency—engineers can visualize feature values for any frame and verify that they align with intuition, and if the classifier makes an unexpected decision we can inspect features to understand why.
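A minimal sketch of these four cue families, assuming consecutive grayscale frames normalized to [0, 1]. The exact formulas (the top-decile edge mask, the doubled-angle coherence measure) are illustrative stand-ins, not the production definitions.

```python
import numpy as np

def extract_features(prev, curr):
    """Interpretable degradation cues from a consecutive grayscale frame pair."""
    # Global brightness: mean luminance; low values suggest darkness.
    brightness = float(curr.mean())

    # Contrast: global standard deviation; haze flattens it.
    contrast = float(curr.std())

    # Edge density via gradient magnitude (top decile counts as "edge").
    gy, gx = np.gradient(curr)
    mag = np.hypot(gx, gy)
    edges = mag > np.percentile(mag, 90)
    edge_density = float(edges.mean())

    # Directional coherence of edge orientations, using doubled angles so
    # opposite gradient directions agree: near 1 = aligned streaks (rain),
    # near 0 = scattered occlusions (snow).
    theta = np.arctan2(gy[edges], gx[edges])
    coherence = float(np.abs(np.exp(2j * theta).mean())) if edges.any() else 0.0

    # Temporal difference energy: precipitation churns pixels between frames.
    temporal_diff = float(np.abs(curr - prev).mean())

    return {"brightness": brightness, "contrast": contrast,
            "edge_density": edge_density, "coherence": coherence,
            "temporal_diff": temporal_diff}
```

For instance, a frame of vertical streaks yields coherence near one, while salt-like noise yields coherence near zero; every value can be printed and eyeballed when a routing decision looks wrong.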
The Temporal Stability Insight
Weather and visibility do not change at frame rate—they change gradually over seconds, minutes, or hours. If the system classifies a scene as hazy now the scene is almost certainly still hazy one second later, and since a camera recording at thirty frames per second captures thirty images in that one second, all thirty images experience the same haze. This observation transformed our approach to classification because we realized we do not need to classify every frame, and once the model establishes confidence in the current condition it can lock that state and skip classification for subsequent frames until evidence suggests a change.

Classification decisions remain stable for minutes while weather conditions persist. Red bars show reclassification events.
The classifier runs on short temporal windows rather than individual frames, so a window of three seconds provides ninety frames at thirty frames per second and the classifier pools evidence across all ninety frames to compute probability scores for each degradation type. After the classifier outputs stable probabilities the system enters a hold state where it continues routing frames to the selected enhancement model but stops running the classifier, relying instead on a lightweight monitoring process that tracks summary statistics like mean brightness and temporal difference energy.
This assumption holds empirically. We tested on hours of outdoor footage from security cameras under various weather conditions and found that classification rarely changed within intervals shorter than thirty seconds, with most conditions persisting for minutes before transitioning. The temporal stability optimization reduces computational load by roughly an order of magnitude because classification involves feature extraction and neural network inference, and running it at thirty frames per second consumes processing capacity that could be spent on enhancement.
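The hold-state logic can be sketched as follows, watching the two summary statistics mentioned above (mean brightness and temporal-difference energy). The drift threshold and window length are illustrative assumptions, not the production values.

```python
import numpy as np
from collections import deque

class HoldStateMonitor:
    """Sketch of the hold state: once classification stabilizes, stop running
    the classifier and watch cheap summary statistics for drift."""

    def __init__(self, drift_threshold=0.15, window=90):
        self.baseline = None                    # locked (brightness, temporal) stats
        self.drift_threshold = drift_threshold
        self.history = deque(maxlen=window)     # rolling window of recent stats

    def update(self, frame, prev):
        """Return True when statistics drift enough to trigger reclassification."""
        stats = np.array([frame.mean(), np.abs(frame - prev).mean()])
        self.history.append(stats)
        if self.baseline is None:
            if len(self.history) == self.history.maxlen:
                # Enough stable evidence: lock the current condition's signature.
                self.baseline = np.mean(list(self.history), axis=0)
            return False
        drift = np.abs(np.mean(list(self.history), axis=0) - self.baseline)
        return bool((drift > self.drift_threshold).any())
```

While `update` keeps returning False, the expensive classifier stays off and frames keep flowing to the already-selected enhancement model; a True triggers a fresh classification window.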
Routing and Enhancement
Once classification stabilizes the system routes frames to the appropriate enhancement model, maintaining five specialized models corresponding to the five degradation types. The haze model performs contrast-aware dehazing by estimating transmission maps that describe how much light reaches the camera from each point in the scene, then inverts the atmospheric scattering process to recover attenuated colors. The rain model removes directional streaks using learned filters that distinguish precipitation patterns from structural edges, with frame-to-frame consistency helping because what appears as rain in one frame should shift position in the next.
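The dehazing inversion follows directly from the scattering model. In this sketch the transmission map and airlight are given, whereas in practice both must be estimated (for example with a dark-channel-style prior), so treat this as the arithmetic core only.

```python
import numpy as np

def dehaze(I, t, A=0.9, t_min=0.1):
    """Invert I = J*t + A*(1 - t) to recover scene radiance J.
    t is the transmission map in (0, 1]; t_min guards against noise
    amplification where almost no direct light survives."""
    t = np.maximum(t, t_min)
    return np.clip((I - A * (1.0 - t)) / t, 0.0, 1.0)

# Round-trip check with a known transmission: hazing then dehazing recovers J.
J = np.random.default_rng(1).random((32, 32))
t_map = np.full_like(J, 0.5)
I_hazy = J * t_map + 0.9 * (1.0 - t_map)
J_rec = dehaze(I_hazy, t_map)
```

The `t_min` floor is the practical concession: where transmission approaches zero, division would amplify sensor noise faster than it recovers signal.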
The snow model denoises by averaging across frames where snowflake positions differ but scene structure remains constant, exploiting temporal fusion to cancel out transient occlusions while preserving static scene details. The darkness model amplifies weak signal while controlling noise by using learned priors about natural image statistics to separate actual scene information from sensor noise. The distortion model corrects compression artifacts and lens aberrations, learning to recognize and suppress these artifacts while preserving genuine scene edges.
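The temporal-fusion idea behind the snow path can be illustrated with a plain per-pixel median. The production model is learned; this shows only the principle that moving occlusions vanish when the static scene dominates each pixel's history.

```python
import numpy as np

def fuse_frames(stack):
    """Per-pixel temporal median over a (frames, H, W) stack: snowflakes land
    on different pixels in each frame, so the static scene wins the vote."""
    return np.median(stack, axis=0)

# Seven frames of a flat 0.3 scene, each with a different row whited out
# by a simulated snow occlusion.
scene = np.full((7, 7), 0.3)
stack = np.tile(scene, (7, 1, 1))
for i in range(7):
    stack[i, i, :] = 1.0              # occlusion moves between frames
restored = fuse_frames(stack)
```

Each pixel is occluded in only one of the seven frames, so the median discards the transient bright value and returns the scene exactly.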
When multiple degradations coexist routing becomes more complex and requires careful sequencing. If both rain and darkness receive high probabilities we apply models sequentially with darkness enhancement running first to lift signal above noise, then rain removal following to clean streaks from the brightened result, because removing rain from a dark image is harder than removing rain from an already-enhanced image. Each enhancement model is independently replaceable, which means modularity supports customization and continuous improvement in ways that a monolithic architecture never could.
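The sequencing rule reduces to a fixed priority list. Only the darkness-before-rain ordering comes from the behavior described above; extending it to all five types, as below, is an illustrative assumption.

```python
# Signal-lifting steps run before occlusion removal, so streak and flake
# filters operate on a brightened image. Order beyond darkness-before-rain
# is an illustrative assumption, not the measured production ordering.
PRIORITY = ["darkness", "haze", "rain", "snow", "distortion"]

def sequence_models(probs, threshold=0.5):
    """Return the enhancement order for a frame's condition probabilities."""
    return [c for c in PRIORITY if probs.get(c, 0.0) >= threshold]
```

A frame scoring high on both rain and darkness thus always routes through darkness enhancement first, regardless of which probability is larger.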
Testing in Real Environments
We tested the complete pipeline using footage from security cameras deployed outdoors in locations with diverse weather and lighting conditions—urban intersections, warehouse perimeters, parking structures, and building entrances where each camera recorded continuously for weeks. Testing focused on three criteria: classification stability, inference latency, and perceptual improvement.
Classification proved stable in practice, with the system holding a single condition label for a median of eight minutes before transitioning to a different state. Latency stayed within budget: total processing remained under thirty milliseconds per frame in stable conditions, and the system sustained thirty frames per second on a single NVIDIA RTX 3080 GPU at 1080p resolution. Enhanced frames scored consistently higher on perceptual quality metrics across all degradation types, with the largest improvements appearing in darkness and haze conditions, and object detection algorithms showed significantly improved detection rates on enhanced footage.
Field testing revealed several unexpected findings. Distortion was more frequent than anticipated—even in good weather and lighting, footage showed compression artifacts severe enough to benefit from correction, so we began treating distortion as a near-default condition. Darkness and haze interact in ways that complicate classification because foggy nights are not simply darkness plus haze, with scattering from fog changing how artificial light sources behave, so we added contextual features to distinguish ambient darkness from patchy illumination in scattering conditions.
Lessons from Implementation
Building this system taught us several things about practical computer vision in uncontrolled environments. Robustness comes from understanding what needs to be computed and what can be left alone—we stopped classifying every frame not because we wanted to save compute but because repeated classification added no value since weather does not change that fast. Stability builds trust even when accuracy is imperfect, and operators preferred a system that made consistent decisions over one that produced slightly better enhancement but oscillated between models unpredictably.
Modular design encourages iteration in ways we came to appreciate during development. We updated individual enhancement models dozens of times while the classifier changed less frequently and routing logic evolved gradually, with each component able to improve independently. Interpretability supports debugging and refinement because feature-based classification gave us transparency into decisions, and when the system misclassified fog as haze we could inspect features to understand why.
Physical grounding improves both accuracy and maintainability because our taxonomy maps to real physical processes where haze is atmospheric scattering and rain is occlusion by falling water, making it easier to design features and enhancement strategies. Real-time constraints drive architectural choices in ways that improve design—the thirty-frame-per-second requirement forced us to find the simplest approach that worked well enough, and that constraint led to better design because simplicity under constraints often produces more robust systems than complexity with unlimited resources.
Conclusion
In practice, seeing clearly is less about perfect models and more about understanding when the world itself has changed. By detecting conditions once and adapting intelligently, the system focuses computation where it matters most. The system we built combines straightforward techniques applied carefully to a specific operational problem—feature-based classification, modular enhancement, and temporal stability—and none of these ideas is novel individually. The engineering value comes from recognizing which assumptions hold in real deployments and designing around them.
Security cameras operate in environments we cannot control, and we cannot change the weather or install better lighting everywhere or replace every cheap camera with a high-end sensor. What we can do is build systems that recognize degradation for what it is, route intelligently based on that recognition, and extract maximum visibility from whatever signal reaches the sensor. The system does not make a camera see more than reality allows—it makes a camera see reality as it is through mathematical reversal of physical degradation processes.
When the world becomes unclear, the system should not panic; it should pause, observe, and adjust once. That principle guided development and reflects how we think computer vision should behave in real environments, not reacting to every frame as if conditions might have changed completely but watching for actual changes and responding appropriately when they occur.
The test of any engineering system is not whether it works in ideal conditions but whether it works when conditions are far from ideal. Security cameras face the worst conditions routinely—fog at dawn, rain at midnight, smoke from fires, dust from construction, bright sun creating harsh shadows and blown-out highlights, cheap sensors with heavy compression and mediocre lenses. Our system was designed for those conditions from the start rather than as an afterthought.
We did not build this system to demonstrate technical capability but because customers need it. Security teams have petabytes of degraded footage in archives and investigators ask what happened during a specific time window only to receive footage so poor that no meaningful analysis is possible, which wastes the investment in camera infrastructure and leaves questions unanswered. Making that footage visible has value—not perfect visibility or forensic-quality reconstruction, but just enough visibility that an investigator can determine what occurred, that an object detector can track movement across frames, that a security operator can identify whether something requires further attention.
Engineering is the process of making things work outside the laboratory, where academic results on clean datasets matter but practical deployment matters more. This system represents our attempt to bridge the gap between research results and operational reliability, working well enough often enough that it produces value despite inevitable imperfections. We learned these lessons by building, testing, failing, and iterating rather than through abstract reasoning, with each iteration teaching us something about the gap between what we thought would work and what actually worked.
The relationship between classification accuracy and enhancement quality is not linear in ways that matter for deployment. A classifier that is ninety percent accurate does not produce enhancement that is ninety percent better, and a single misclassification at the wrong moment can route frames through an inappropriate model for seconds before correction occurs. Stability matters more than instantaneous accuracy for this reason—a classifier that is eighty-five percent accurate but stable produces better results than a classifier that is ninety-five percent accurate but oscillates between states, which we discovered only by watching the system operate over extended periods.
In the end, the system works. It makes degraded footage more useful, runs fast enough for real-time processing, and operates reliably without constant oversight, which is what matters—not whether it represents the most sophisticated possible approach but whether it solves the problem customers face. Absentia builds tools for when the world is not ideal, and this weather-aware vision system is one of those tools that handles rain, snow, fog, darkness, and distortion because those are the conditions that real cameras face every day.
Written by the Absentia Engineering Team

