ThinkRL

Integrations

ThinkRL integrates natively with Hugging Face, Weights & Biases, and vLLM. Each integration is enabled through fields on ModelConfig.

Hugging Face

Load Transformers models and Datasets by their Hub identifiers, and optionally push the resulting model back to the Hub.

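config = ModelConfig(
    model_name_or_path="meta-llama/Meta-Llama-3-8B",  # Hub model ID or local path
    dataset_name="tatsu-lab/alpaca",                  # Hub dataset ID
    push_to_hub=True,                                 # upload the model to the Hub
    hub_model_id="your-org/model-name",               # target repository on the Hub
)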
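As a rough illustration of what these four fields plausibly map to, the sketch below uses a hypothetical `HubSettings` dataclass as a stand-in for `ModelConfig`; ThinkRL's actual class and internals are not shown in this page, so the stand-in, `validate`, and `load_and_push` are assumptions. The Hugging Face calls themselves (`AutoModelForCausalLM.from_pretrained`, `load_dataset`, `push_to_hub`) are the standard library entry points.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HubSettings:
    """Hypothetical stand-in for the Hub-related fields of ModelConfig."""
    model_name_or_path: str
    dataset_name: str
    push_to_hub: bool = False
    hub_model_id: Optional[str] = None


def validate(settings: HubSettings) -> List[str]:
    """Return a list of problems that would make a Hub upload fail early."""
    problems = []
    if settings.push_to_hub and not settings.hub_model_id:
        problems.append("push_to_hub=True requires hub_model_id")
    return problems


def load_and_push(settings: HubSettings) -> None:
    """Assumed flow: load model and data, train, then upload.

    Imports are deferred so the helpers above work without the libraries.
    Running this downloads weights and needs Hub credentials.
    """
    from datasets import load_dataset
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(settings.model_name_or_path)
    dataset = load_dataset(settings.dataset_name)
    # ... training on `dataset` would happen here ...
    if settings.push_to_hub:
        model.push_to_hub(settings.hub_model_id)
```

Checking `validate(...)` before starting a long run is a cheap way to catch a missing `hub_model_id` before any GPU time is spent.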

Weights & Biases

Log training runs to Weights & Biases for experiment tracking and visualization.

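config = ModelConfig(
    report_to="wandb",                  # send metrics to Weights & Biases
    wandb_project="thinkrl-training",   # W&B project that groups related runs
    wandb_run_name="vapo-llama3-8b",    # display name for this run
)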
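For orientation, these settings correspond naturally to arguments of `wandb.init`. The sketch below is an assumption about that mapping, not ThinkRL's actual code: `wandb_init_kwargs` and `log_demo_run` are hypothetical helpers, and the metric name `train/loss` is illustrative. Only `wandb.init(project=..., name=...)`, `run.log`, and `run.finish` are real W&B calls.

```python
from typing import Dict, Iterable, Optional


def wandb_init_kwargs(report_to: str, project: str, run_name: str) -> Optional[Dict[str, str]]:
    """Translate the config fields into wandb.init keyword arguments.

    Returns None when W&B reporting is not requested.
    """
    if report_to != "wandb":
        return None
    return {"project": project, "name": run_name}


def log_demo_run(kwargs: Dict[str, str], losses: Iterable[float]) -> None:
    """Log one scalar per step. Requires `wandb` installed and a logged-in user."""
    import wandb  # deferred so the helper above works without the package

    run = wandb.init(**kwargs)
    for step, loss in enumerate(losses):
        run.log({"train/loss": loss}, step=step)
    run.finish()
```

Calling `wandb_init_kwargs("wandb", "thinkrl-training", "vapo-llama3-8b")` yields the project and run name exactly as they would appear in the W&B dashboard.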

vLLM

Use vLLM for high-throughput inference (response generation) during RLHF training.

config = ModelConfig(
    use_vllm=True,                      # generate with vLLM
    vllm_tensor_parallel_size=2,        # shard the model across 2 GPUs
    vllm_gpu_memory_utilization=0.9,    # fraction of each GPU's memory vLLM may use
)
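The two numeric settings mirror vLLM's own `tensor_parallel_size` and `gpu_memory_utilization` options. To get a feel for what the values above imply, here is a back-of-the-envelope estimate. It is a sketch under stated assumptions: `vllm_memory_budget` is a hypothetical helper, weights are assumed to shard evenly across GPUs, the model is taken as roughly 8 billion parameters in 16-bit precision, and the 80 GiB GPU size is chosen only for illustration.

```python
from typing import Tuple


def vllm_memory_budget(
    total_gib_per_gpu: float,
    gpu_memory_utilization: float,
    tensor_parallel_size: int,
    n_params_billion: float,
    bytes_per_param: int = 2,  # 2 bytes for fp16/bf16 weights
) -> Tuple[float, float, float]:
    """Estimate per-GPU memory in GiB: (budget, weights, remainder).

    The remainder is what is left for the KV cache and activations.
    Assumes weights are split evenly across the tensor-parallel group.
    """
    budget = total_gib_per_gpu * gpu_memory_utilization
    weights_total = n_params_billion * 1e9 * bytes_per_param / 2**30
    weights_per_gpu = weights_total / tensor_parallel_size
    return budget, weights_per_gpu, budget - weights_per_gpu


budget, weights, remainder = vllm_memory_budget(
    total_gib_per_gpu=80.0,
    gpu_memory_utilization=0.9,
    tensor_parallel_size=2,
    n_params_billion=8.0,
)
print(f"budget={budget:.1f} GiB, weights={weights:.2f} GiB, remainder={remainder:.2f} GiB")
```

With these inputs each GPU gets a 72 GiB budget, of which about 7.5 GiB holds weights. If generation shares GPUs with the training process, a lower utilization value may be needed to leave room for training state.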