Does Llama 4 use chunked attention in the generation phase?

#64
by vanshils - opened

Same as title.
I know a chunked attention mask is applied in the context (prefill) phase. Does Llama 4 also apply a chunked attention mask in the generation (decode) phase?
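To make the question concrete, here is a minimal sketch of what I mean by a chunked attention mask in each phase. It is only an illustration under my own assumptions, not Llama 4's actual implementation: the function names and the tiny `chunk_size` are hypothetical, and the masks are plain Python lists rather than tensors.

```python
def chunked_causal_mask(seq_len, chunk_size):
    """Context-phase mask: mask[q][k] is True if query q may attend to key k.

    A key is visible only if it is not in the future (k <= q) and lies in
    the same fixed-size chunk as the query.
    """
    return [
        [k <= q and k // chunk_size == q // chunk_size for k in range(seq_len)]
        for q in range(seq_len)
    ]


def decode_step_mask(num_cached, chunk_size):
    """Generation-phase mask row for one new token at position num_cached.

    The keys are the cached positions 0..num_cached-1 plus the new token
    itself. Causality holds automatically, so only the chunk test remains.
    """
    q = num_cached
    return [k // chunk_size == q // chunk_size for k in range(q + 1)]


# Hypothetical sizes chosen so the masks are easy to read by eye.
prefill = chunked_causal_mask(seq_len=6, chunk_size=4)
step = decode_step_mask(num_cached=5, chunk_size=4)

for row in prefill:
    print("".join("x" if visible else "." for visible in row))
print("decode row:", "".join("x" if visible else "." for visible in step))
```

If generation is chunked in the same way, the row for a new token should match the last row of the context-phase mask, so cached keys from earlier chunks would be masked out. If it is not chunked, the new token would attend to every cached key.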
