May 19, 2026
A compact metalens-based nanophotonic sensor mimics insect compound eyes to detect motion across a 135-degree field of view and predict object trajectories using deep learning.
(Nanowerk News) Researchers at Southeast University in Nanjing, China, have developed an ultrathin nanophotonic sensor that mimics the compound eyes of insects to detect and predict the motion of objects across an ultra-wide field of view. The flat, compact device combines a metalens array with a custom deep neural network to extract precise velocity and direction data from wide-angle scenes, outperforming conventional computer vision methods when tracking small, slow, or camouflaged targets.
The work, led by Professor Ji Chen and co-corresponding author Professor Zaichen Zhang, was published in Light: Advanced Manufacturing (“Bioinspired planar intelligent nanophotonic sensor for wide-angle accurate motion perception and prediction”).
Key Findings
- The planar intelligent nanophotonic sensor achieves a horizontal field of view exceeding 135 degrees using three phase-engineered metalenses on a flat substrate, eliminating the need for curved lens arrays.
- A deep neural network called meta-motion sense (MMS) extracts optical flow from captured scenes and detects motion more accurately than the YOLO object detection algorithm, particularly for tiny, slow-moving, and background-blended targets.
- A lightweight trajectory prediction framework forecasts future positions of multiple moving objects simultaneously, with an average prediction time of 3.21 milliseconds per frame per target.
Insects such as dragonflies and bees rely on compound eyes made up of many individual photoreceptive units, or ommatidia, arranged on a curved surface. This arrangement gives them a panoramic view, rapid temporal resolution, and acute motion sensitivity. Engineers have attempted to replicate these capabilities in artificial vision systems, but traditional approaches require curved lens arrays that are difficult to fabricate, bulky, and hard to integrate into compact devices.
Metalenses offer a way around this constraint. These ultrathin optical elements use subwavelength nanostructures to manipulate light within a single flat layer, delivering focusing performance comparable to conventional lenses at a fraction of the size and weight. The Southeast University team arranged three metalenses, each one millimeter in diameter with a 1.6-millimeter focal length, into a planar array.
Each metalens captures a distinct 45-degree angular range, and the three together cover a horizontal field of view exceeding 135 degrees. The phase profile of each side metalens was generated by adding a tilt phase to the central metalens design, directing its focus off-axis without requiring physical curvature.
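The tilt-phase idea can be illustrated with a short numerical sketch. The article gives no formula, so the hyperbolic focusing profile, the 550-nanometer design wavelength (taken from the bandpass filter), the 45-degree steering angle, and the sign convention below are all assumptions made for illustration, not the authors' design.

```python
import numpy as np

WAVELENGTH = 550e-9    # assumed design wavelength (m), matching the 550 nm filter
FOCAL_LENGTH = 1.6e-3  # focal length (m), from the article
DIAMETER = 1.0e-3      # metalens diameter (m), from the article

def central_phase(x, y, f=FOCAL_LENGTH, lam=WAVELENGTH):
    """Hyperbolic focusing phase (assumed form; not given in the article)."""
    return -2 * np.pi / lam * (np.sqrt(x**2 + y**2 + f**2) - f)

def side_phase(x, y, tilt_deg, f=FOCAL_LENGTH, lam=WAVELENGTH):
    """Central profile plus a linear tilt phase along x (hypothetical sign convention)."""
    tilt = 2 * np.pi / lam * x * np.sin(np.radians(tilt_deg))
    return central_phase(x, y, f, lam) + tilt

def wrap(phase):
    """Wrap a phase map into one 2*pi period, as a flat metasurface would encode it."""
    return np.mod(phase, 2 * np.pi)

# Sample the aperture on a square grid.
n = 201
coords = np.linspace(-DIAMETER / 2, DIAMETER / 2, n)
x, y = np.meshgrid(coords, coords)

phi_center = central_phase(x, y)
phi_left = side_phase(x, y, tilt_deg=-45.0)   # assumed steering angle
phi_right = side_phase(x, y, tilt_deg=+45.0)  # assumed steering angle
```

In this sketch the two side profiles differ from the central one only by a linear ramp in x, which is why the array can stay flat: the steering is encoded in phase rather than in surface curvature.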
The researchers call their device a planar intelligent nanophotonic sensor, or PINS. Its core optics sit atop a CMOS image sensor connected to a microcontroller. A 550-nanometer bandpass optical filter and a stray light-blocking mask suppress unwanted ambient light to improve contrast and signal quality.
The mechanical housing was 3D-printed from polylactic acid, and the spacing between the metalens array and the image sensor can be fine-tuned with two screws to optimize focus at different target distances. The final reconstructed wide-angle image has an effective resolution of approximately 480 by 1,440 pixels.
“The established imaging model effectively solves the problem of insufficient dedicated datasets for metalens vision, while the designed neural network and prediction framework achieve high-precision perception and multi-target trajectory forecasting under interference backgrounds,” the researchers said.
To enable the sensor to perceive motion, the team built a deep neural network they named meta-motion sense, or MMS. Rather than classifying objects by shape, as the widely used YOLO algorithm does, MMS computes optical flow between consecutive image frames. Optical flow encodes how each pixel shifts from one frame to the next, capturing the speed and direction of every object in the scene without relying on shape-based recognition.
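To make the idea of optical flow concrete, the following minimal sketch converts a per-pixel displacement field into speed and direction maps. The function name, array layout, and toy data are hypothetical; the 30 fps frame rate matches the figure quoted later in the article, and nothing here reproduces the MMS network itself.

```python
import numpy as np

def flow_to_motion(flow, fps=30.0):
    """Convert a flow field into speed and direction maps.

    flow: (H, W, 2) array of per-pixel (dx, dy) shifts, in pixels per frame.
    Returns speed in pixels per second and direction in degrees
    (image coordinates, so positive y points down).
    """
    dx, dy = flow[..., 0], flow[..., 1]
    speed = np.hypot(dx, dy) * fps
    direction = np.degrees(np.arctan2(dy, dx))
    return speed, direction

# Toy example: a 2x2 patch moving 3 px right and 4 px down per frame.
flow = np.zeros((4, 4, 2))
flow[1:3, 1:3] = (3.0, 4.0)

speed, direction = flow_to_motion(flow)
moving = speed > 0  # mask of pixels that moved, with no shape recognition involved
```

The mask of moving pixels falls out of the flow field directly, which is why this kind of approach does not depend on an object being large or sharp enough to recognize.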
The network architecture uses a dual-stage multi-scale composite encoder to extract features at different spatial scales. A recurrent update mechanism then iteratively refines the optical flow estimates by sampling correlation features from a precomputed four-dimensional correlation volume pyramid, progressively improving accuracy with each pass.
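A four-dimensional correlation volume pyramid can be sketched in a few lines. The description resembles the all-pairs correlation used in recurrent optical-flow networks such as RAFT, but the article does not say whether MMS follows that design exactly, so the scaling, pooling scheme, and feature sizes below are assumptions for illustration.

```python
import numpy as np

def correlation_volume(f1, f2):
    """All-pairs feature correlation.

    f1, f2: (C, H, W) feature maps from two consecutive frames.
    Returns an (H, W, H, W) volume: entry [h, w, i, j] is the scaled dot
    product of the feature at (h, w) in frame 1 and (i, j) in frame 2.
    """
    channels = f1.shape[0]
    return np.einsum("chw,cij->hwij", f1, f2) / np.sqrt(channels)

def correlation_pyramid(corr, levels=3):
    """Average-pool the last two axes by a factor of 2 at each level."""
    pyramid = [corr]
    for _ in range(levels - 1):
        h, w, h2, w2 = pyramid[-1].shape
        cropped = pyramid[-1][:, :, : h2 // 2 * 2, : w2 // 2 * 2]
        pooled = cropped.reshape(h, w, h2 // 2, 2, w2 // 2, 2).mean(axis=(3, 5))
        pyramid.append(pooled)
    return pyramid

# Random stand-in features (hypothetical sizes, not the MMS encoder output).
rng = np.random.default_rng(0)
feat1 = rng.standard_normal((16, 8, 8))
feat2 = rng.standard_normal((16, 8, 8))

pyramid = correlation_pyramid(correlation_volume(feat1, feat2))
shapes = [level.shape for level in pyramid]
```

Because the volume is computed once and only sampled afterwards, each refinement pass can look up matches at several scales without recomputing features.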
Training the MMS network required large volumes of paired image frames with known optical flow ground truth, but no existing public dataset matched the imaging characteristics of a metalens system. The team solved this by experimentally measuring the point spread function of their metalens and using it to transform established optical flow datasets, including FlyingChairs, Sintel, KITTI, and HD1K, into synthetic metalens images.
Because convolving each image with the point spread function preserved spatial consistency, the original optical flow labels could be reused without manual annotation. The approach produced roughly 25,200 training pairs and closely replicated the blur, noise, and brightness of actual metalens captures, as confirmed by side-by-side visual and quantitative comparison.
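The dataset transformation can be sketched as a convolution of each frame with a point spread function, with the flow label passed through unchanged. The Gaussian kernel below is only a placeholder for the experimentally measured PSF, the function names are hypothetical, and the sketch covers blur alone, not the noise and brightness matching the article also mentions.

```python
import numpy as np

def gaussian_psf(size=9, sigma=1.5):
    """Placeholder PSF; the team used a measured one, not a Gaussian."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return psf / psf.sum()  # normalize so overall brightness is preserved

def apply_psf(image, psf):
    """Convolve a 2D image with the PSF, edge-padding to keep the size."""
    k = psf.shape[0] // 2
    h, w = image.shape
    padded = np.pad(image, k, mode="edge")
    flipped = psf[::-1, ::-1]  # convolution flips the kernel
    out = np.zeros_like(image, dtype=float)
    for i in range(psf.shape[0]):
        for j in range(psf.shape[1]):
            out += flipped[i, j] * padded[i : i + h, j : j + w]
    return out

def synthesize_pair(frame1, frame2, flow, psf):
    """Blur both frames; the flow label is reused as-is."""
    return apply_psf(frame1, psf), apply_psf(frame2, psf), flow

# Toy pair: a 4x4 square that shifts 2 px to the right between frames.
frame1 = np.zeros((32, 32))
frame1[10:14, 10:14] = 1.0
frame2 = np.roll(frame1, 2, axis=1)
flow = np.zeros((32, 32, 2))
flow[10:14, 10:14, 0] = 2.0

blur1, blur2, flow_out = synthesize_pair(frame1, frame2, flow, gaussian_psf())
```

Blurring the two frames with the same kernel shifts nothing: the blurred second frame is still the blurred first frame displaced by 2 pixels, which is the property that lets the original labels carry over.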
In testing, the MMS network detected motion that YOLO missed entirely. When two vehicles of different sizes moved across a wide-angle scene, YOLO identified the larger car but failed to recognize the smaller one because of imaging blur. YOLO also failed to register a large vehicle’s motion when the displacement between frames was small. MMS captured both cases accurately.
In a more complex scene with multiple moving objects against a cluttered background, YOLO produced both misidentifications and missed detections, while MMS reliably identified every moving target. The trained network achieved an average inference time of 110.15 milliseconds per image pair on an NVIDIA RTX 4090 GPU, corresponding to roughly nine frames per second.
“Benefiting from the lightweight algorithm architecture and high hardware integration, the system realizes millisecond-level trajectory prediction and stable continuous tracking of multiple crossing and overlapping targets,” the team added.
The researchers also developed a motion trajectory prediction framework, or MTP, that forecasts where objects will move next. The framework extracts the region of interest from each optical flow map, estimates the centroid position, applies Kalman filtering to smooth velocity data, fits the velocity trend using weighted polynomial regression, and extrapolates future positions. For a single target, average prediction time was 3.21 milliseconds per frame.
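The steps of such a pipeline can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' MTP code: the scalar random-walk Kalman filter, the exponential weighting of recent frames, the polynomial degree, and all function names and parameter values are hypothetical choices.

```python
import numpy as np

def centroid(flow_mag, threshold=0.5):
    """Centroid (x, y) of pixels whose flow magnitude exceeds the threshold."""
    ys, xs = np.nonzero(flow_mag > threshold)
    return np.array([xs.mean(), ys.mean()])

def kalman_smooth(z, q=1e-2, r=1e-1):
    """Scalar random-walk Kalman filter over a 1D measurement series."""
    x, p = z[0], 1.0
    out = [x]
    for meas in z[1:]:
        p = p + q               # predict: state unchanged, uncertainty grows
        gain = p / (p + r)      # Kalman gain
        x = x + gain * (meas - x)
        p = (1 - gain) * p
        out.append(x)
    return np.array(out)

def predict_positions(centroids, horizon, degree=2, decay=0.9):
    """Extrapolate `horizon` future centroids from a (T, 2) position history."""
    centroids = np.asarray(centroids, dtype=float)
    vel = np.diff(centroids, axis=0)         # per-frame velocity, shape (T-1, 2)
    t = np.arange(len(vel))
    weights = decay ** (t[-1] - t)           # newer samples weigh more
    future_t = np.arange(len(vel), len(vel) + horizon)
    pred = np.empty((horizon, 2))
    for axis in range(2):
        smooth = kalman_smooth(vel[:, axis])
        coeffs = np.polyfit(t, smooth, degree, w=weights)
        future_v = np.polyval(coeffs, future_t)
        # Integrate predicted velocities forward from the last known position.
        pred[:, axis] = centroids[-1, axis] + np.cumsum(future_v)
    return pred

# Toy region of interest and a constant-velocity history.
mag = np.zeros((10, 10))
mag[2:4, 5:8] = 1.0
roi_center = centroid(mag)

history = [(2.0 * t, 1.0 * t) for t in range(20)]
pred = predict_positions(history, horizon=5)
```

Fitting the velocity trend rather than the positions themselves keeps the extrapolation anchored to the last observed centroid, so errors in the fit accumulate gradually over the prediction horizon instead of appearing as an immediate jump.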
The team tested the MTP on three representative motion patterns: sinusoidal, planar spiral, and electron-like trajectories. At a one-frame prediction horizon (about 33 milliseconds at 30 fps), predicted positions closely matched the actual centroid path across all motion types. At a 30-frame horizon (one second at 30 fps), deviations grew but remained manageable as more historical data accumulated.
For multi-object scenarios, the MTP framework adds connected-component segmentation and K-means clustering in the angular domain to separate overlapping targets moving in different directions. Each detected object receives a unique identifier maintained across frames through motion continuity. In a test sequence of 300 frames containing five labeled objects, the system continuously tracked all targets and accurately predicted their trajectories 15 frames ahead.
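Clustering in the angular domain can be illustrated with a small K-means sketch over flow directions. It covers only the clustering step, not the connected-component segmentation or the identifier tracking, and the unit-circle embedding, the deterministic initialization, and the function name are assumptions rather than details from the article.

```python
import numpy as np

def kmeans_directions(flow_vectors, k=2, iters=20):
    """Cluster flow vectors by direction.

    flow_vectors: (N, 2) array of (dx, dy) per pixel.
    Returns an (N,) array of cluster labels.
    """
    angles = np.arctan2(flow_vectors[:, 1], flow_vectors[:, 0])
    # Embed angles on the unit circle so +179 and -179 degrees stay close.
    pts = np.column_stack([np.cos(angles), np.sin(angles)])

    # Deterministic farthest-point initialization (illustrative choice).
    centers = [pts[0]]
    for _ in range(k - 1):
        d = np.min([np.sum((pts - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(pts[np.argmax(d)])
    centers = np.array(centers)

    for _ in range(iters):
        dists = ((pts[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pts[labels == j].mean(axis=0)
    return labels

# Two overlapping targets: one moving right, one moving left.
right = np.array([[3.0, 0.1], [2.5, -0.2], [4.0, 0.0]])
left = np.array([[-3.0, 0.1], [-2.0, -0.1], [-5.0, 0.05]])
labels = kmeans_directions(np.vstack([right, left]), k=2)
```

Clustering on direction rather than position is what lets two targets be separated even when their pixels touch or overlap in the image.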
“The proposed metalens-based bionic sensing scheme not only overcomes the bottlenecks of traditional curved bionic vision and ordinary machine vision, but also provides a feasible technical route for next-generation miniaturized intelligent photoelectric perception,” the researchers said.
The metalens array was fabricated by depositing a one-micrometer-thick silicon nitride layer on a silicon dioxide substrate. Electron-beam lithography defined cylindrical nanostructures with diameters ranging from 81 to 212 nanometers, selected through simulation to cover a full 2π phase range with transmittance exceeding 90 percent. Scanning electron microscopy confirmed the structural uniformity and fabrication precision of the finished array.
Because the device is compact and lightweight, it can be mounted on miniature unmanned aerial vehicles, wearable electronics, and other portable platforms. The team demonstrated integration with a small drone for environmental sensing. The modular metalens array design also allows expansion of angular coverage by adding metalenses, or adjustment of overlap between adjacent sectors for either greater information redundancy or broader spatial coverage, depending on the application.
The authors note that several hardware upgrades could push the system closer to real-time operation. CMOS sensors with smaller pixels and faster frame rates would improve spatial and temporal resolution. Inertial measurement data from a drone platform could subtract platform-induced motion from the optical flow, isolating the movement of independent objects. Future work could also extend the field of view into two dimensions for near-omnidirectional observation and add depth-sensing capability through a binocular configuration.