Learn to compress instruction-tuned language models with llmcompressor using FP8, GPTQ, and SmoothQuant quantization, and benchmark the performance of the compressed models.