Hudeifa Hassan

CS Student @ SRH Berlin

Hi, I'm Hudeifa. I'm a CS student at SRH Berlin working on efficient inference for video generation models. My current focus is on autoregressive video diffusion. I'm investigating sparse attention patterns, quantization, and inference-time optimizations to make long-form video generation practical.

Research

Sparse Attention Patterns in Video Diffusion: Do They Preserve Semantic Consistency Across Frames?

● in progress

Sparse and linear attention approximations cut compute along the spatial dimensions, but video also needs temporal attention to stay consistent across frames. We're probing whether the sparse attention masks used for efficiency inadvertently disrupt cross-frame token interactions, and whether simple structured sparsity patterns, such as attending to the first frame plus a local window, can recover consistency at lower cost. This is directly relevant to autoregressive streaming pipelines, where full temporal attention is the main bottleneck.
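As a rough illustration of the "first frame plus a local window" pattern, here is a minimal pure-Python sketch of a token-level attention mask. The function names, the frame-causal default, and the fixed tokens-per-frame layout are assumptions made for illustration, not the masks used in this project or in any particular library.

```python
def first_frame_plus_window_mask(num_frames, tokens_per_frame, window, causal=True):
    """Build a boolean mask where mask[q][k] is True if query token q may attend to key token k.

    Assumed layout (hypothetical): tokens are ordered frame by frame,
    with `tokens_per_frame` tokens in each frame.
    """
    n = num_frames * tokens_per_frame
    mask = [[False] * n for _ in range(n)]
    for q in range(n):
        fq = q // tokens_per_frame  # frame index of the query token
        for k in range(n):
            fk = k // tokens_per_frame  # frame index of the key token
            if causal and fk > fq:
                continue  # assumption: a frame never attends to future frames
            is_anchor = fk == 0                 # always keep the first frame
            is_local = abs(fq - fk) <= window   # keep frames within the local window
            mask[q][k] = is_anchor or is_local
    return mask


def density(mask):
    """Fraction of query-key pairs that the mask keeps."""
    total = sum(len(row) for row in mask)
    kept = sum(sum(row) for row in mask)
    return kept / total
```

For example, `first_frame_plus_window_mask(4, 2, 1)` keeps 36 of the 64 query-key pairs (density 0.5625): tokens in the last frame still see the first frame and the previous frame, but not the frame two steps back. In a real pipeline a mask like this would be passed to the attention kernel rather than materialized as nested lists.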

Sparse Attention · Video Diffusion · Temporal Consistency · SLA · SageAttention

Temporal Attention Drift in Autoregressive Video Diffusion

● in progress

As autoregressive video models generate longer sequences, how do temporal attention patterns evolve? We're studying whether attention to early frames dilutes over time and whether this dilution directly causes the quality degradation seen in long sequences. We're also testing whether forcing attention to anchor on keyframes can recover consistency without retraining. This is a pure inference-time analysis, aimed at a mechanistic explanation for what LongLive and Self-Forcing++ are implicitly trying to fix.
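To make "dilution" and "anchoring" concrete, here is a minimal toy sketch for a single query token. It measures how much softmax attention lands on the first frame, then adds a constant bias to those logits before the softmax. The helper names, the additive-bias form of anchoring, and the uniform logits are all assumptions for illustration, not the method or measurements from this project.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def anchor_mass(weights, tokens_per_frame, anchor_frames=1):
    """Fraction of one query's attention that lands on the first `anchor_frames` frames."""
    return sum(weights[: anchor_frames * tokens_per_frame])


def anchored_softmax(logits, tokens_per_frame, anchor_frames=1, bias=0.0):
    """Softmax after adding `bias` to the logits of the anchor-frame tokens.

    Hypothetical form of inference-time anchoring: no weights are changed,
    only the attention logits are shifted before normalization.
    """
    n_anchor = anchor_frames * tokens_per_frame
    biased = [x + bias if i < n_anchor else x for i, x in enumerate(logits)]
    return softmax(biased)


# Toy case: uniform logits, so attention spreads evenly over the whole context.
tokens_per_frame = 2
for num_frames in (4, 16, 64):
    logits = [0.0] * (num_frames * tokens_per_frame)
    plain = anchor_mass(softmax(logits), tokens_per_frame)
    anchored = anchor_mass(
        anchored_softmax(logits, tokens_per_frame, bias=math.log(3)), tokens_per_frame
    )
    print(num_frames, round(plain, 4), round(anchored, 4))
```

With uniform logits the first-frame mass is simply 1 / num_frames, so it shrinks as the context grows; that is arithmetic, not an empirical finding. Whether real models show the same trend, and whether a bias like this helps quality, is exactly what the analysis has to establish.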

AR Video · Attention Drift · Keyframe Anchoring · Inference Analysis

Recent Posts

Nothing here yet. Check back soon.

→ read the blog