Integrate with GPT-NeoX¶
GPT-NeoX is a library for efficiently training large language models with tens of billions of parameters in a multi-machine distributed context. It is maintained by EleutherAI.
Instrument your runs with Comet to start managing experiments, creating dataset versions, and tracking hyperparameters for faster and easier reproducibility and collaboration.
| Comet SDK | Minimum SDK version | Minimum GPT-NeoX version |
|---|---|---|
| Python-SDK | 3.45.0 | master |
Start logging¶
Add the following to your GPT-NeoX config, or create a separate configuration file:

```json
{
  "use_comet": true,
  "comet_project": "<your-project-name>",
  "comet_experiment_name": "<your-experiment-name>",
  "comet_tags": ["<experiment-tag>"],
  "comet_others": { "<experiment-other-name>": "<experiment-other-value>" }
}
```
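If you prefer to generate this fragment programmatically instead of editing it by hand, a minimal sketch using only the Python standard library (the placeholder values are illustrative, not required names):

```python
import json

# Illustrative Comet settings; replace the placeholders with your own values.
comet_settings = {
    "use_comet": True,
    "comet_project": "my-project",
    "comet_experiment_name": "my-experiment",
    "comet_tags": ["baseline"],
    "comet_others": {"run-group": "gpt-neox"},
}

# json.dumps emits strict JSON, so no stray trailing commas sneak into the file.
config_text = json.dumps(comet_settings, indent=2)
print(config_text)
```

Writing the file with `json.dumps` guarantees the fragment parses as valid JSON before you hand it to the training launcher.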
Tip
Find the full list of GPT-NeoX recipe configs in the configs directory of the GPT-NeoX repository.
Log automatically¶
When using the integration, Comet automatically logs the following items by default, with no additional configuration:

- Training metrics such as `train/lm_loss`, `timers/forward`, and `runtime/flops_per_sec_per_gpu`.
- All hyperparameters, such as `data_path`, the DeepSpeed configuration, and anything else included in the config file.
End-to-end example¶
The following is a basic example of using Comet with GPT-NeoX to train the GPT-2-style 1-3B (1.3B-parameter) model.
Clone the repo¶
```shell
git clone https://github.com/EleutherAI/gpt-neox/
```
Install dependencies¶
```shell
python -m pip install -r gpt-neox/requirements/requirements.txt -r gpt-neox/requirements/requirements-comet.txt
```
Log-in to Comet¶
```shell
comet_ml login
```
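If you are running non-interactively (for example, on a cluster node where the login prompt is impractical), the Comet SDK also reads your credentials from the `COMET_API_KEY` environment variable, so an alternative to the interactive login is:

```shell
# Alternative to interactive login: export the API key directly.
# Replace the placeholder with your real Comet API key.
export COMET_API_KEY="<your-api-key>"
```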
Download the training dataset¶
```shell
cd gpt-neox && python prepare_data.py enwik8
```
Write the GPT-NeoX config with Comet¶
Write the following config file to `gpt-neox/configs/comet.yml`:

```json
{ "use_comet": true }
```
Run the example on a single node¶
```shell
python ./gpt-neox/deepy.py ./gpt-neox/train.py \
  ./gpt-neox/configs/1-3B.yml \
  ./gpt-neox/configs/slurm_local.yml \
  ./gpt-neox/configs/comet.yml
```
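Note that the launcher accepts several config files at once and combines them into a single configuration, which is why the small `comet.yml` fragment can sit alongside the model and launcher configs. The sketch below illustrates the general idea of layering fragments with a plain dict update; it is not GPT-NeoX's actual merge logic, and the keys shown are illustrative:

```python
# Rough illustration of layering config fragments: each later fragment's
# keys are applied on top of the accumulated settings. This is NOT
# GPT-NeoX's actual loader, just the general idea.
def layer_configs(*configs: dict) -> dict:
    merged: dict = {}
    for cfg in configs:
        merged.update(cfg)
    return merged

# Hypothetical fragments standing in for 1-3B.yml and comet.yml.
model_cfg = {"train_batch_size": 32, "use_comet": False}
comet_cfg = {"use_comet": True, "comet_project": "gpt-neox-demo"}

final = layer_configs(model_cfg, comet_cfg)
print(final["use_comet"])  # the Comet fragment takes effect in the merged config
```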
Nov. 18, 2024