Search results for keyword: object detection

keynote

From Pixels to Preservation: The Power of Large Vision Models in Heritage Content Understanding

Jing Zhang

SUMAC '24: Proceedings of the 6th workshop on the analySis, Understanding and proMotion of heritAge Contents, Pages 3–4, https://doi.org/10.1145/3689094.3689470

Preserving cultural heritage is essential for maintaining the legacy and history of human civilization, but it presents challenges in managing vast amounts of historical artifacts and documents. Recent advances in artificial intelligence, especially ...

research-article

Open Access

Multimodal Understanding: Investigating the Capabilities of Large Multimodal Models for Object Detection in XR Applications

LGM3A '24: Proceedings of the 2nd Workshop on Large Generative Models Meet Multimodal Applications, Pages 26–35, https://doi.org/10.1145/3688866.3689126

Extended Reality (XR), encompassing the concepts of augmented, virtual, and mixed reality, has the potential to offer unprecedented types of user interactions. An essential requirement is the automated understanding of a user's current scene, for ...

research-article

Domain Adaptive Object Detection for UAV-based Images by Robust Representation Learning and Multiple Pseudo-label Aggregation

EMCLR'24: Proceedings of the 1st International Workshop on Efficient Multimedia Computing under Limited, Pages 59–67, https://doi.org/10.1145/3688863.3689576

Object detection on aerial images captured by Unmanned Aerial Vehicles (UAVs) has a wide range of applications. Due to the variations in illumination, weather conditions and scene backgrounds, the testing images (target domain) typically exhibit ...

research-article

Instance-aware Fine-grained Micro-action Recognition

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia, Pages 11320–11326, https://doi.org/10.1145/3664647.3688976

Micro-action involves low-amplitude movement of the human body, which poses challenges for common action recognition. This paper focuses on the extremely small region of the human body as well as the severe long-tail distribution in micro-action recognition. An ...

research-article

Fractional Correspondence Framework in Detection Transformer

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia, Pages 5498–5506, https://doi.org/10.1145/3664647.3681613

The Detection Transformer (DETR), by incorporating the Hungarian algorithm, has significantly simplified the matching process in object detection tasks. This algorithm facilitates optimal one-to-one matching of predicted bounding boxes to ground-truth ...
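To illustrate the one-to-one matching this abstract refers to, here is a minimal sketch that finds the lowest-cost assignment of predictions to ground-truth boxes. It is not the paper's method and not DETR's implementation: it uses exhaustive search over permutations as a stand-in for the Hungarian algorithm (same optimum, far slower), and the function name `optimal_one_to_one` and the cost values are hypothetical. In DETR the matching cost combines class and box terms; here the costs are simply given as a matrix.

```python
from itertools import permutations


def optimal_one_to_one(cost):
    """Return the cheapest one-to-one assignment and its total cost.

    cost[i][j] is the (assumed, precomputed) cost of assigning
    prediction i to ground-truth box j. Assumes there are at least as
    many predictions as ground-truth boxes. Exhaustive search is used
    here only for clarity; the Hungarian algorithm reaches the same
    optimum in polynomial time.
    """
    n_pred = len(cost)
    n_gt = len(cost[0]) if cost else 0
    best_total, best_pairs = float("inf"), []
    # Each permutation picks a distinct prediction for every ground truth.
    for perm in permutations(range(n_pred), n_gt):
        total = sum(cost[p][g] for g, p in enumerate(perm))
        if total < best_total:
            best_total = total
            best_pairs = [(p, g) for g, p in enumerate(perm)]
    return sorted(best_pairs), best_total


# Hypothetical costs: 3 predictions (rows) and 2 ground-truth boxes (columns).
cost = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.7, 0.6],
]
pairs, total = optimal_one_to_one(cost)
print(pairs)  # (prediction index, ground-truth index) pairs
```

With these example costs, prediction 0 is paired with ground truth 1 and prediction 1 with ground truth 0; prediction 2 stays unmatched, which in DETR corresponds to the "no object" class.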

research-article

Open Access

Uni-YOLO: Vision-Language Model-Guided YOLO for Robust and Fast Universal Detection in the Open World

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia, Pages 1991–2000, https://doi.org/10.1145/3664647.3681212

Universal object detectors aim to detect any object in any scene without human annotation, exhibiting superior generalization. However, the current universal object detectors show degraded performance in harsh weather, and their insufficient real-time ...

research-article

EPL-UFLSID: Efficient Pseudo Labels-Driven Underwater Forward-Looking Sonar Images Object Detection

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia, Pages 4349–4357, https://doi.org/10.1145/3664647.3681160

Sonar imaging is widely utilized in submarine and underwater detection missions. However, due to the complex underwater environment, sonar images suffer from complex distortions and noise, making it hard for detection models to extract clean high-level ...

research-article

Adaptive Hierarchical Aggregation for Federated Object Detection

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia, Pages 3732–3740, https://doi.org/10.1145/3664647.3681158

In practical object detection scenarios, distributed data and stringent privacy protections significantly limit the feasibility of traditional centralized training methods. Federated learning (FL) emerges as a promising solution to this dilemma. ...

research-article

SparseFormer: Detecting Objects in HRW Shots via Sparse Vision Transformer

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia, Pages 4851–4860, https://doi.org/10.1145/3664647.3681043

Recent years have seen an increase in the use of gigapixel-level image and video capture systems and benchmarks with high-resolution wide (HRW) shots. However, unlike close-up shots in the MS COCO dataset, the higher resolution and wider field of view ...

research-article

Purified Distillation: Bridging Domain Shift and Category Gap in Incremental Object Detection

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia, Pages 1197–1205, https://doi.org/10.1145/3664647.3681031

Incremental Object Detection (IOD) simulates the dynamic data flow in real-world applications, which require detectors to learn new classes or adapt to new domains while retaining knowledge from previous tasks. Most existing IOD methods focus only on ...

research-article

Diffusion Domain Teacher: Diffusion Guided Domain Adaptive Object Detector

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia, Pages 3284–3293, https://doi.org/10.1145/3664647.3680962

Object detectors often suffer a decrease in performance due to the large domain gap between the training data (source domain) and real-world data (target domain). Diffusion-based generative models have shown remarkable abilities in generating high-...

research-article

WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia, Pages 7947–7956, https://doi.org/10.1145/3664647.3680960

Weakly-supervised visual recognition using inexact supervision is a critical yet challenging learning problem. It significantly reduces human labeling costs and traditionally relies on multi-instance learning and pseudo-labeling. This paper introduces ...

research-article

Alleviating the Equilibrium Challenge with Sample Virtual Labeling for Adversarial Domain Adaptation

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia, Pages 2681–2689, https://doi.org/10.1145/3664647.3680929

Many domain adaptive object detection (DAOD) methods employ domain adversarial training to align features and mitigate the domain gap. In this approach, a feature extractor is trained to deceive a domain classifier, thereby aligning feature ...

research-article

LOVD: Large-and-Open Vocabulary Object Detection

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia, Pages 9321–9329, https://doi.org/10.1145/3664647.3680925

Existing open-vocabulary object detectors require an accurate and compact vocabulary pre-defined during inference. Their performance is largely degraded in real scenarios where the underlying vocabulary may be indeterminate and often exponentially large. ...

research-article

Stochastic Context Consistency Reasoning for Domain Adaptive Object Detection

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia, Pages 1331–1340, https://doi.org/10.1145/3664647.3680899

Domain Adaptive Object Detection (DAOD) aims to adapt a detector to the unlabeled target domain using the labeled source domain. Recent advances leverage a self-training framework to enable a student model to learn the target domain ...

research-article

Open Access

mmBox: Harnessing Millimeter-Wave Signals for Reliable Vehicle and Pedestrians Detection

ACM Transactions on Internet of Things (TIOT), Volume 5, Issue 4, Article No.: 22, Pages 1–30, https://doi.org/10.1145/3695883

Object detection plays a pivotal role in various fields; for example, a smart traffic system relies on detection results for decision-making. However, existing studies predominantly use optical cameras and LiDAR, which exhibit limitations in adverse ...

short-paper

Free

M2IoU: A Min-Max Distance-based Loss Function for Bounding Box Regression in Medical Imaging

CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, Pages 4041–4045, https://doi.org/10.1145/3627673.3679958

Computer vision applications such as object detection have increased manifold in the medical domain for diagnosis and treatment purposes. Generally, object detection models such as YOLO (You Only Look Once) involve identifying the correct bounding box ...
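As background for the bounding-box regression this abstract mentions, here is a minimal sketch of plain intersection over union (IoU) and the basic `1 - IoU` loss that IoU-style losses build on. It does not implement the paper's M2IoU loss, whose definition is not given in this listing; the function names and the `(x1, y1, x2, y2)` corner format are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are assumed to be (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
    """
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    # Two degenerate (zero-area) boxes have no meaningful overlap.
    return inter / union if union > 0 else 0.0


def iou_loss(pred_box, target_box):
    """Plain IoU loss: 0 for a perfect match, 1 for disjoint boxes."""
    return 1.0 - iou(pred_box, target_box)


# Two 10x10 boxes offset by 5 in each direction overlap in a 5x5 square.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))
```

Plain IoU loss is constant at 1 whenever the boxes do not overlap, which gives a regressor no signal about how far apart they are; distance-aware variants are typically motivated by that limitation.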

short-paper

Open Access

Intricate Object Detection in Self Driving Environments with Edge-Adaptive Depth Estimation(EADE)

CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, Pages 3837–3841, https://doi.org/10.1145/3627673.3679948

Autonomous vehicles make decisions and exercise control based on various object recognition results. The driving environment is characterized by the coexistence of a multitude of objects of varying shapes and sizes. Therefore, the ability to accurately recognise ...

research-article

Free

JUST ACCEPTED

Learning and Vision-based approach for Human fall detection and classification in naturally occurring scenes using video data

ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), Just Accepted, https://doi.org/10.1145/3687125

The advancement of medicine presents challenges for modern cultures, especially unpredictable falls among elderly people, which can happen anywhere because of serious health issues. Delayed rescue of at-risk elders can be dangerous. Traditional elder safety methods like ...

research-article

Open Access

Basic Safety Message Generation through a Video-based Analytics for Potential Safety Applications

ACM Journal on Autonomous Transportation Systems (JATS), Volume 1, Issue 4, Article No.: 23, Pages 1–26, https://doi.org/10.1145/3643823

With the advancement of modern artificial intelligence techniques, computer vision can play a vital role in enhancing roadway safety by reducing the risk of imminent collisions. To do so, a vision-based safety application is required, where a roadside ...
