Why did num_key_value_heads change from 16 to 8?

#1
by codys12 - opened

Was the previous version of the model even usable or was it bugged out?

Problem after config change: RuntimeError: shape '[25, 1024, 8, 128]' is invalid for input of size 52428800
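
For reference, the numbers in that error can be checked by hand. This is only a sketch, and it assumes the target shape reads as (batch, seq_len, num_key_value_heads, head_dim) and that head_dim is still 128; the variable names are mine, not the model's code.

```python
# Numbers copied from the RuntimeError above.
# Assumed layout: (batch, seq_len, num_key_value_heads, head_dim).
target_shape = (25, 1024, 8, 128)
input_numel = 52428800

batch, seq_len, kv_heads_in_config, head_dim = target_shape

# Features per token actually present in the tensor being reshaped.
features_per_token = input_numel // (batch * seq_len)

# KV heads implied by the tensor, assuming head_dim is unchanged.
kv_heads_in_tensor = features_per_token // head_dim

print(features_per_token)   # 2048
print(kv_heads_in_tensor)   # 16
print(kv_heads_in_config)   # 8
```

If that reading is right, the tensor being reshaped still carries 16 key/value heads per token while the new config asks for 8, which would explain why the reshape fails.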
