Abstract: Recent Multi-modal Large Language Models (MLLMs) have been challenged by the computational overhead resulting from massive video frames, often alleviated through compression strategies.
Abstract: Efficient and accurate multi-modal data transmission remains a key challenge in semantic communications. Existing research has overlooked the multi-level semantic similarity inherent in ...