Detect to Focus: Latent-Space Autofocusing System with Decentralized Hierarchical Multi-Agent Reinforcement Learning

Research output: Contribution to journal › Journal article › Research › peer-review

State-of-the-art object detection models are frequently trained offline on available datasets such as ImageNet: large, overly diverse collections that are unbalanced and hard to cluster semantically. Models trained this way lose detection performance when the illumination, the environmental conditions (e.g., rain or dust), or the lens positioning (out-of-focus blur) change. We propose a simple way to intelligently control the camera and lens focusing settings in such scenarios using DASHA, a Decentralized Autofocusing System with Hierarchical Agents. Our agents learn to focus on scenes in challenging environments, significantly enhancing the pattern recognition capacity beyond that of popular detection models (YOLO, Faster R-CNN, and Retina are considered). At the same time, the decentralized training helps protect the equipment from overheating. The algorithm relies on the latent representation of the camera's stream and is thus the first method to allow completely no-reference imaging, in which the system trains itself to autofocus.

The paper introduces a novel method for auto-tuning imaging equipment via hierarchical reinforcement learning. The technique uses two interacting agents that independently manage the camera and lens settings, enabling optimal focus across different lighting situations. The distinguishing aspect of this approach is its reliance on the latent feature vector of the real-time image scene for autofocusing, making it the first method of its kind to auto-tune a camera without requiring reference or calibration data.

Original language: English
Journal: IEEE Access
Volume: 11
Pages (from-to): 85214-85223
Number of pages: 10
ISSN: 2169-3536
DOIs
Publication status: Published - 2023

Bibliographical note

Publisher Copyright:
© 2013 IEEE.

    Research areas

  • Artificial intelligence, computer vision, imaging, lenses, multi-agent systems, neural networks, photography, reinforcement learning

ID: 368343910