ThinkRL

Optimization

Performance optimizations for faster and more memory-efficient training.

Sequence Packing

Concatenate multiple short sequences into a single packed example, eliminating padding waste for up to 2x faster training on datasets dominated by short sequences.

config = ModelConfig(
    use_packing=True,
    max_packed_length=4096,
)
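The packing step can be sketched as a greedy first-fit bin packer: each sequence goes into the first packed example with enough remaining room, and a new one opens when none fits. This is an illustrative sketch, not ThinkRL's actual packing implementation, and the function name is hypothetical:

```python
def pack_sequences(lengths, max_packed_length):
    """Greedy first-fit packing: place each sequence into the first
    bin with room, opening a new bin when none fits."""
    bins = []   # each bin holds the indices of the sequences packed into it
    free = []   # remaining token capacity per bin
    for i, n in enumerate(lengths):
        if n > max_packed_length:
            raise ValueError(f"sequence {i} exceeds max_packed_length")
        for b, cap in enumerate(free):
            if n <= cap:
                bins[b].append(i)
                free[b] -= n
                break
        else:
            bins.append([i])
            free.append(max_packed_length - n)
    return bins

# Four short sequences fit into two packed examples of length <= 4096
# instead of four padded rows.
print(pack_sequences([1500, 2000, 2500, 1000], 4096))  # → [[0, 1], [2, 3]]
```

With `use_packing=True`, attention masking must still keep the packed sequences from attending to each other; the packer above only decides the grouping.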

LoRA / QLoRA

Parameter-efficient fine-tuning for memory-constrained environments.

config = ModelConfig(
    use_lora=True,
    lora_r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    # QLoRA
    load_in_4bit=True,
)
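LoRA freezes the base weight W and learns a low-rank update scaled by `lora_alpha / lora_r`, so only `r * (d_in + d_out)` parameters train instead of `d_in * d_out`. A minimal pure-Python sketch of the forward pass (illustrative names, not ThinkRL's implementation):

```python
def lora_forward(x, W, A, B, alpha, r):
    """y = W x + (alpha / r) * B (A x); W is frozen, only A and B train."""
    def matvec(M, v):
        return [sum(m * vi for m, vi in zip(row, v)) for row in M]
    base = matvec(W, x)                 # frozen base projection
    delta = matvec(B, matvec(A, x))     # low-rank update, rank r bottleneck
    scale = alpha / r
    return [b + scale * d for b, d in zip(base, delta)]

# Toy 2x2 layer with rank-1 adapters; the parameter savings grow
# with dimension (e.g. rank 16 on a 4096x4096 projection).
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight
A = [[1.0, 1.0]]               # r x d_in, trainable
B = [[0.5], [0.5]]             # d_out x r, trainable
print(lora_forward([2.0, 3.0], W, A, B, alpha=32, r=16))  # → [7.0, 8.0]
```

QLoRA applies the same update on top of a 4-bit-quantized base weight (`load_in_4bit=True`), cutting memory further while the adapters stay in higher precision.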

Gradient Checkpointing

Recompute activations during the backward pass instead of storing them all, trading extra compute for lower memory so larger models fit. Pairing this with gradient accumulation raises the effective batch size (micro-batch size x accumulation steps) without additional memory.

config = ModelConfig(
    gradient_checkpointing=True,
    gradient_accumulation_steps=8,
)
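The idea behind checkpointing can be sketched in plain Python (illustrative, not ThinkRL's internals): store activations only at checkpoint boundaries, and rebuild the segments in between on demand during the backward pass.

```python
def forward_with_checkpoints(x, layers, every=2):
    """Run layers forward, but keep only every `every`-th activation
    (plus the initial input) as a checkpoint."""
    checkpoints = {0: x}
    h = x
    for i, layer in enumerate(layers):
        h = layer(h)
        if (i + 1) % every == 0 and i + 1 < len(layers):
            checkpoints[i + 1] = h
    return h, checkpoints

def recompute_activation(checkpoints, layers, upto):
    """During backward, recompute the activation entering layer `upto`
    from the nearest stored checkpoint instead of having saved it."""
    start = max(k for k in checkpoints if k <= upto)
    h = checkpoints[start]
    for layer in layers[start:upto]:
        h = layer(h)
    return h

layers = [lambda v: v + 1, lambda v: v * 2, lambda v: v + 3, lambda v: v * 4]
out, ckpts = forward_with_checkpoints(0, layers, every=2)
# Only activations at positions {0, 2} are stored instead of all four;
# the input to layer 3 is rebuilt from checkpoint 2 when needed.
print(out, sorted(ckpts), recompute_activation(ckpts, layers, 3))  # → 20 [0, 2] 5
```

Memory for activations drops roughly by the checkpoint interval, at the cost of one extra partial forward pass per segment during backward.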