Party Building / Branch Activities

The Fifth Session of the "知行融创" Forum Successfully Held

On September 24, 2025, the Party branch of the Key Laboratory of Fundamental Software and Systems successfully held the fifth session of the "知行融创" forum. The session focused on frontier research in fundamental software and systems: four comrades presented their latest results in computer vision, AI security, trajectory-prediction verification, and formal methods for hardware design. The main contents of their talks are given below.

Photo: scene of the event

李佳洺. SemARFlow: Injecting Semantics into Unsupervised Optical Flow Estimation for Autonomous Driving
Unsupervised optical flow estimation is especially hard near occlusions and motion boundaries and in low-texture regions. We show that additional information such as semantics and domain knowledge can help better constrain this problem. We introduce SemARFlow, an unsupervised optical flow network designed for autonomous driving data that takes estimated semantic segmentation masks as additional inputs. This additional information is injected into the encoder and into a learned upsampler that refines the flow output. In addition, a simple yet effective semantic augmentation module provides self-supervision when learning flow and its boundaries for vehicles, poles, and sky. Together, these injections of semantic information improve the KITTI-2015 optical flow test error rate from 11.80% to 8.38%. We also show visible improvements around object boundaries as well as a greater ability to generalize across datasets. Code is available at https://github.com/duke-vision/semantic-unsup-flow-release.
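To make the semantic injection concrete, here is a minimal PyTorch-style sketch, assuming hypothetical module and parameter names rather than the authors' implementation, of one way an estimated segmentation mask could be concatenated into a flow encoder and used by a learned upsampler that refines the flow output.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticFlowEncoder(nn.Module):
    """Encode an image together with its estimated semantic segmentation mask."""
    def __init__(self, num_classes: int = 19, feat_dim: int = 64):
        super().__init__()
        self.num_classes = num_classes
        # Image channels and one-hot semantic channels are concatenated at the input.
        self.net = nn.Sequential(
            nn.Conv2d(3 + num_classes, feat_dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_dim, feat_dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, image, seg_labels):
        # image: (B, 3, H, W); seg_labels: (B, H, W) integer class ids
        onehot = F.one_hot(seg_labels, self.num_classes).permute(0, 3, 1, 2).float()
        return self.net(torch.cat([image, onehot], dim=1))

class SemanticUpsampler(nn.Module):
    """Refine bilinearly upsampled coarse flow using the full-resolution semantic mask."""
    def __init__(self, num_classes: int = 19):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Conv2d(2 + num_classes, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 2, 3, padding=1),
        )

    def forward(self, coarse_flow, seg_onehot):
        scale = seg_onehot.shape[-1] / coarse_flow.shape[-1]
        up = scale * F.interpolate(coarse_flow, size=seg_onehot.shape[-2:],
                                   mode="bilinear", align_corners=False)
        return up + self.refine(torch.cat([up, seg_onehot], dim=1))
```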

Pictured: 李佳洺

程络. Shadowcast: Stealthy Data Poisoning Attacks against Vision-Language Models
Vision-Language Models (VLMs) excel in generating textual responses from visual inputs, but their versatility raises security concerns. This study takes the first step in exposing VLMs’ susceptibility to data poisoning attacks that can manipulate responses to innocuous, everyday prompts. We introduce Shadowcast, a stealthy data poisoning attack where poison samples are visually indistinguishable from benign images with matching texts. Shadowcast demonstrates effectiveness in two attack types. The first is a traditional Label Attack, tricking VLMs into misidentifying class labels, such as confusing Donald Trump for Joe Biden. The second is a novel Persuasion Attack, leveraging VLMs’ text generation capabilities to craft persuasive and seemingly rational narratives for misinformation, such as portraying junk food as healthy. We show that Shadowcast effectively achieves the attacker’s intentions using as few as 50 poison samples. Crucially, the poisoned samples demonstrate transferability across different VLM architectures, posing a significant concern in black-box settings. Moreover, Shadowcast remains potent under realistic conditions involving various text prompts, training data augmentation, and image compression techniques. This work reveals how poisoned VLMs can disseminate convincing yet deceptive misinformation to everyday, benign users, emphasizing the importance of data integrity for responsible VLM deployments. Our code is available at: https://github.com/umd-huang-lab/VLM-Poisoning.
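The core mechanism behind such a stealthy, clean-label poison can be sketched generically. The snippet below is an illustrative outline only, not the Shadowcast code: `vision_encoder` stands in for any frozen image feature extractor, and the perturbation budget and step counts are assumed values. A small, bounded perturbation pushes a benign-looking base image's features toward a target concept; the perturbed image is then paired with target-concept text.

```python
import torch
import torch.nn.functional as F

def craft_poison(base_img, target_img, vision_encoder, eps=8/255, steps=200, lr=1e-2):
    """Return an image visually close to base_img whose features approach target_img's."""
    delta = torch.zeros_like(base_img, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    with torch.no_grad():
        target_feat = vision_encoder(target_img)
    for _ in range(steps):
        poison = (base_img + delta).clamp(0, 1)
        loss = F.mse_loss(vision_encoder(poison), target_feat)  # match target features
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)  # L_inf bound keeps the change visually imperceptible
    return (base_img + delta).detach().clamp(0, 1)
```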

Pictured: 程络

张亮. TrajPAC: Towards Robustness Verification of Pedestrian Trajectory Prediction Models
Robust pedestrian trajectory forecasting is crucial to developing safe autonomous vehicles. Although previous works have studied adversarial robustness in the context of trajectory forecasting, some significant issues remain unaddressed. In this work, we try to tackle these crucial problems. Firstly, the previous definitions of robustness in trajectory prediction are ambiguous. We thus provide formal definitions for two kinds of robustness, namely label robustness and pure robustness. Secondly, as previous works fail to consider robustness with respect to all points in a disturbance interval, we utilise a probably approximately correct (PAC) framework for robustness verification. Additionally, this framework can not only identify potential counterexamples but also provide interpretable analyses of the original methods. Our approach is applied using a prototype tool named TRAJPAC. With TRAJPAC, we evaluate the robustness of four state-of-the-art trajectory prediction models on trajectories from five scenes of the ETH/UCY dataset and scenes of the Stanford Drone Dataset. Using our framework, we also experimentally study various factors that could influence robustness performance.
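As intuition for the PAC flavor of this kind of verification, the following is a generic sampling-based sketch, not the TrajPAC procedure itself: `predict`, the disturbance radius, and the deviation threshold are assumed stand-ins. If no violation appears among n ≥ ln(1/δ)/(−ln(1−ε)) i.i.d. perturbations, then with confidence 1 − δ the probability of a violation over the disturbance interval is at most ε.

```python
import math
import numpy as np

def pac_robustness_check(predict, history, radius=0.05, threshold=0.5,
                         eps=0.01, delta=0.01, seed=0):
    """Sampling-based PAC-style robustness check for a trajectory predictor."""
    rng = np.random.default_rng(seed)
    nominal = predict(history)
    n = math.ceil(math.log(1 / delta) / -math.log(1 - eps))
    for _ in range(n):
        noise = rng.uniform(-radius, radius, size=history.shape)
        deviation = np.linalg.norm(predict(history + noise) - nominal, axis=-1).max()
        if deviation > threshold:
            return False, history + noise   # potential counterexample
    return True, None                       # PAC-robust for parameters (eps, delta)
```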

Pictured: 张亮

王珂音. A Formally Verified Procedure for Width Inference in FIRRTL
FIRRTL is an intermediate representation language for Register Transfer Level (RTL) hardware designs. In FIRRTL programs, the bit widths of many components are not given explicitly and must be inferred during compilation. In mainstream FIRRTL compilers such as the official compiler firtool, width inference is conducted by a compilation pass referred to as InferWidths, which may fail even for simple FIRRTL programs. In this paper, we thoroughly investigate the width inference problem for FIRRTL programs. We show that if the constraint obtained from a FIRRTL program is satisfiable, there must exist a unique least solution. Based on this foundational result, we propose a complete procedure for solving the width inference problem, which can handle FIRRTL programs where firtool may fail. We implement the procedure in the interactive theorem prover Rocq and prove its functional correctness. From the Rocq implementation, we extract an OCaml implementation, which is the first formally verified InferWidths pass. Extensive experiments demonstrate that it can solve more instances than the official InferWidths pass in firtool, usually with high efficiency.
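To illustrate why a unique least solution exists and how it can be reached, here is a toy sketch under a deliberately simplified constraint encoding; it is not the verified Rocq/OCaml pass, and it only covers constraints of the shape width(x) ≥ max(width(y) + c, ...). Kleene-style iteration from all-zero widths either stabilizes at the least solution or diverges when the constraints have no finite solution.

```python
def infer_widths(constraints, max_iters=10_000):
    """constraints: list of (lhs, terms); each term (var, off) means
    width[lhs] >= max over terms of (width[var] + off), with var=None for a constant."""
    width = {lhs: 0 for lhs, _ in constraints}
    for _, terms in constraints:
        for v, _ in terms:
            if v is not None:
                width.setdefault(v, 0)
    for _ in range(max_iters):
        changed = False
        for lhs, terms in constraints:
            bound = max((width[v] if v is not None else 0) + off for v, off in terms)
            if bound > width[lhs]:
                width[lhs], changed = bound, True
        if not changed:
            return width                      # unique least solution
    raise ValueError("widths diverge: the constraint set has no finite solution")

# a declared UInt<4>; b = a + a widens by one bit; c = mux(sel, a, b) takes the max
print(infer_widths([
    ("a", [(None, 4)]),
    ("b", [("a", 1)]),
    ("c", [("a", 0), ("b", 0)]),
]))  # -> {'a': 4, 'b': 5, 'c': 5}
```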

Pictured: 王珂音

This session of the "知行融创" forum centered on frontier challenges in fundamental software and systems and showcased the innovative results of the laboratory's Party members in computer vision, AI security, trajectory-prediction verification, and formal methods for hardware design. The forum continues to provide Party members with a high-level platform for academic exchange and the collision of ideas, effectively promoting cross-fertilization between different research directions, putting the deep integration of Party building and research work into practice, and motivating Party members to tackle difficult problems and scale new heights on the front lines of research serving major national strategic needs.