The demo code gets stuck: GPU memory is occupied, but there’s no output
#3, opened by YCG09
When I run the demo code, the process hangs: GPU memory is allocated, but there is no output and no visible progress.
I am running on 4 L20 GPUs. Memory is allocated as expected and the cores show load, but execution never gets past the line response = get_assistant().
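To help narrow down where the call blocks, here is a minimal sketch using Python's standard faulthandler module. It periodically dumps the stack of every thread while the call has not yet returned. The wrapper name run_with_hang_dump and the 60-second default are assumptions for illustration; get_assistant is the function from the demo, and nothing here depends on how it is implemented.

```python
import faulthandler
import sys


def run_with_hang_dump(fn, *args, timeout_s=60.0, file=None, **kwargs):
    """Call fn(*args, **kwargs) and dump all thread stacks if it is slow.

    Hypothetical helper, not part of the demo: every timeout_s seconds that
    fn has not returned, the tracebacks of all Python threads are written to
    `file` (default: sys.stderr). The call itself is not interrupted.
    """
    target = file if file is not None else sys.stderr
    # A watchdog thread writes the tracebacks; repeat=True keeps dumping
    # at each interval so progress (or lack of it) is visible over time.
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=target)
    try:
        return fn(*args, **kwargs)
    finally:
        # Always disarm the watchdog, whether fn returned or raised.
        faulthandler.cancel_dump_traceback_later()
```

In the demo this would be used as response = run_with_hang_dump(get_assistant, timeout_s=120.0). If the same frame shows up at the top of every dump, that is most likely where the process is blocked.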