Same as the title. I know a chunked attention mask is used in the context (prefill) phase, but does Llama 4 also apply a chunked attention mask in the generation (decoding) phase?
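To make the question concrete, here is a minimal sketch of what I mean. It is only an illustration under my own assumptions, not Llama 4's actual code: the function names `chunked_causal_mask` and `decode_step_mask` and the toy sizes are hypothetical. The first function builds the chunked causal mask I understand is used during the context phase; the second shows what a single decode step's mask row would look like if the same chunking were also applied during generation.

```python
def chunked_causal_mask(seq_len, chunk_size):
    """mask[q][k] is True when query position q may attend to key position k.

    A key is visible only if it is not in the future (k <= q) and it falls
    in the same fixed-size chunk as the query.
    """
    return [
        [k <= q and (k // chunk_size) == (q // chunk_size) for k in range(seq_len)]
        for q in range(seq_len)
    ]


def decode_step_mask(cache_len, chunk_size):
    """Mask row for one new token at position cache_len over cache_len + 1 keys.

    Assumption: chunking is also applied at generation time, so the new token
    sees only the cached keys that share its chunk (plus itself).
    """
    q = cache_len
    return [(k // chunk_size) == (q // chunk_size) for k in range(cache_len + 1)]


if __name__ == "__main__":
    # Toy sizes for illustration only.
    for row in chunked_causal_mask(seq_len=6, chunk_size=4):
        print("".join("x" if visible else "." for visible in row))
    print(decode_step_mask(cache_len=5, chunk_size=4))
```

If generation did not use chunking, the decode row would instead be all `True`, i.e. the new token would attend to the whole KV cache. That difference is what I am asking about.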