Model Optimization & Efficient AI
Scope
Model compression and acceleration: low-rank approximation, LoRA, quantization, pruning, and related methods.
Keywords
Compression, parameter efficiency, inference speed, memory optimization