LLM Scaling Law and Efficiency


In this session, our readings cover:

Required Readings:

Scaling Laws for Neural Language Models

Efficient Large Language Models: A Survey

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

More Readings:

An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token Routing

LIMA: Less Is More for Alignment