Integrate SwanLab for offline/online experiment tracking and local visualization #36433

ShaohonChen · 2025-02-26T19:44:24Z

What does this PR do?

This PR introduces SwanLab, a lightweight open-source experiment tracking tool, as a new logging option for the training framework. The integration provides both online and offline tracking capabilities, along with a local dashboard for visualizing results.

SwanLab has previously supported tracking the Transformers training framework through external callbacks (find more information here), serving a wide range of users—especially those in regions with limited network connectivity, such as China. With this official integration, we aim to further enhance the developer experience by making tracking more seamless and user-friendly.

The "Additional information about this PR" section below gives a detailed overview of the changes and usage instructions.

Before submitting

- This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- Did you write any new necessary tests? (I don't see any tests for any of the callbacks, but please let me know if I missed them somewhere.)

Who can review?

@SunMarc I have reviewed recent merges related to training tracking, and it seems that you are the most relevant reviewer for this PR. Could you please help review it or forward it to the appropriate person? Thank you!

Additional information about this PR

Key Features of SwanLab Integration

1. Online and Offline Tracking:

Online Mode: Track experiments remotely and store data on SwanLab's cloud platform.
Offline Mode: Use a local dashboard to visualize training logs without an internet connection.

2. Hardware Monitoring:

Automatically tracks GPU usage, power consumption, temperature, and other hardware metrics.
Supports NVIDIA GPUs and Huawei Ascend NPUs.

3. Remote Access:

View training progress remotely via the SwanLab web interface or mobile app.

4. Local Dashboard:

Includes an open-source local dashboard for offline visualization of training logs.

Usage guideline

Step 0: Set Up Code and Environment

Following the transformers official text classification example:

# Prepare the code and environment
git clone https://github.com/huggingface/transformers
cd transformers
pip install -e .
cd examples/pytorch/text-classification
pip install -r requirements.txt

Step 1: Set Up SwanLab Online Tracking

Install:

pip install swanlab

To use SwanLab's online tracking, log in to the SwanLab website and obtain your API key from the Settings page. Then, authenticate using the following command:

swanlab login

If you prefer offline mode, skip this step.

Step 2: Configure SwanLab as the Logger and run example

To enable SwanLab as the experiment tracker, add --report_to swanlab to your training command. For example, using the text classification workflow from Step 0:

python run_glue.py \
  --model_name_or_path google-bert/bert-base-cased \
  --dataset_name imdb \
  --do_train \
  --do_predict \
  --max_seq_length 128 \
  --per_device_train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --output_dir /tmp/imdb/ \
  --report_to swanlab
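For scripts that build the trainer in Python rather than through the command line, the commit message later in this PR names `report_to="swanlab"` in `TrainingArguments` as the switch. The following is a minimal sketch of that route, assuming `transformers` (with this integration) and `swanlab` are installed. The helper names `swanlab_training_kwargs` and `make_training_arguments` are hypothetical, and the hyperparameter values are simply copied from the command above.

```python
def swanlab_training_kwargs(output_dir="/tmp/imdb/"):
    """Keyword arguments mirroring the run_glue.py flags above (illustrative helper)."""
    return {
        "output_dir": output_dir,
        "per_device_train_batch_size": 32,
        "learning_rate": 2e-5,
        "num_train_epochs": 3,
        # Route Trainer logs to SwanLab, as described in this PR's commit message.
        "report_to": "swanlab",
    }


def make_training_arguments(output_dir="/tmp/imdb/"):
    """Build TrainingArguments with SwanLab reporting enabled."""
    # Imported lazily so the kwargs helper can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(**swanlab_training_kwargs(output_dir))
```

The returned object would then be passed as `args=` when constructing a `Trainer`, in the same way as any other `TrainingArguments` instance.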

[Image: visualization demo]

To use local tracking, set the following environment variable:

# Linux & macOS
export SWANLAB_MODE="local"
# Windows (PowerShell)
$env:SWANLAB_MODE = "local"

Alternatively, you can configure SwanLab using environment variables:

export SWANLAB_API_KEY=<your_api_key>          # Set API key for online tracking
export SWANLAB_LOG_DIR=<local_log_path>        # Set local log directory
export SWANLAB_MODE=<mode>                    # Set tracking mode: cloud (default), cloud-only, local, or disabled
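The same variables can also be set from Python before training starts, which can be convenient in notebooks. Below is a minimal sketch using only the standard library. The helper name `configure_swanlab_env` is hypothetical, and the accepted modes are taken from the comment above rather than from SwanLab's source, so treat the validation as an assumption.

```python
import os

# Modes as listed in the environment-variable comment above (assumed complete).
SWANLAB_MODES = ("cloud", "cloud-only", "local", "disabled")


def configure_swanlab_env(mode="cloud", api_key=None, log_dir=None, env=None):
    """Write SwanLab settings into `env` (defaults to os.environ) and return it."""
    if mode not in SWANLAB_MODES:
        raise ValueError(
            f"unknown SWANLAB_MODE {mode!r}; expected one of {SWANLAB_MODES}"
        )
    target = os.environ if env is None else env
    target["SWANLAB_MODE"] = mode
    # Only set the optional variables when a value was actually provided.
    if api_key is not None:
        target["SWANLAB_API_KEY"] = api_key
    if log_dir is not None:
        target["SWANLAB_LOG_DIR"] = log_dir
    return target
```

For example, `configure_swanlab_env(mode="local", log_dir="./my_logs")` (the path is a placeholder) would be called before the trainer is created, on the assumption that SwanLab reads these variables when it initializes.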

Step 3: View Training Logs

After logging in, you will see a confirmation message:

Online Tracking: View logs on the SwanLab website.

For more details, refer to the SwanLab Cloud Documentation.

Offline Tracking: Use the local dashboard to visualize logs:

swanlab watch

For advanced configurations, such as setting a custom port, refer to the Offline Dashboard Documentation and CLI Documentation.

…in transformers

- Integrated SwanLab into the transformers library as an alternative for experiment tracking.
- Users can now log training metrics, hyperparameters, and other experiment details to SwanLab by setting `report_to="swanlab"` in the `TrainingArguments`.
- Added necessary dependencies and documentation for SwanLab integration.

SunMarc

Thanks for adding this ! Just a few nits !

SunMarc · 2025-02-28T14:27:38Z

src/transformers/integrations/integration_utils.py

+        - **SWANLAB_PROJECT** (`str`, *optional*, defaults to `None`):
+            Set this to a custom string to store results in a different project. If not specified, the name of the running
+            directory is used.
+


maybe typo ?

Suggested change

- **SWANLAB_PROJECT** (`str`, *optional*, defaults to `None`):
  Set this to a custom string to store results in a different project. If not specified, the name of the running
  directory is used.

- **SWANLAB_PROJECTS** (`str`, *optional*, defaults to `None`):
  Set this to a custom string to store results in a different project. If not specified, the name of the running
  directory is used.

ShaohonChen · 2025-02-28T16:49:28Z

Thanks for the review! I'll update it shortly.

Co-authored-by: Marc Sun <[email protected]>

ShaohonChen

I have made the required changes and fixed the typo. Thanks to @SunMarc for the review!

ShaohonChen · 2025-02-28T16:59:11Z

src/transformers/integrations/integration_utils.py

+        - **SWANLAB_PROJECT** (`str`, *optional*, defaults to `None`):
+            Set this to a custom string to store results in a different project. If not specified, the name of the running
+            directory is used.
+


Good catch! I’ve fixed the typo.

…ansformers into integrate-swanlab

ShaohonChen · 2025-02-28T17:38:55Z

I ran the example in examples/pytorch/text-classification with the updated code. Everything looks good!
Here is a screenshot and the link to the test run:

results here

Really appreciate your suggestions. Please have a look when you get a chance!

SunMarc

Thanks for iterating !

SAKURA-CAT · 2025-03-01T06:01:46Z

It seems the test timed out. Could it be due to excessive computational load?

SAKURA-CAT · 2025-03-01T07:15:14Z

It seems the test timed out. Could it be due to excessive computational load?

Re-running the test succeeded. 🤔

ShaohonChen · 2025-03-01T07:26:54Z

@SunMarc I have updated the code and fixed the incorrect comments. After merging the latest main branch, I noticed that the automated tests seem to have failed. I checked, and it doesn't appear to be an issue caused by my changes. Could you help me restart the tests?

ShaohonChen added 4 commits February 27, 2025 01:11

add swanlab integration

f1747a6

Fix the spelling error of SwanLabCallback in callback.md

a0ace66

Merge branch 'main' into integrate-swanlab

2d9b7e2

ShaohonChen force-pushed the integrate-swanlab branch from 012a976 to 2d9b7e2 Compare February 28, 2025 05:00

SunMarc reviewed Feb 28, 2025

ShaohonChen and others added 2 commits March 1, 2025 00:59

Apply suggestions from code review

f3854d1

Co-authored-by: Marc Sun <[email protected]>

Fix typo in comment

83d9022

ShaohonChen commented Feb 28, 2025

ShaohonChen added 2 commits March 1, 2025 01:13

Fix typo in comment

d152803

Merge branch 'integrate-swanlab' of https://github.com/ShaohonChen/tr…

52c64e1

…ansformers into integrate-swanlab

SunMarc approved these changes Feb 28, 2025

SunMarc reviewed Feb 28, 2025

Fix typos and update comments

b309479

Zeyi-Lin and others added 5 commits March 1, 2025 14:30

fix annotation

33048a2

Merge pull request #1 from Zeyi-Lin/integrate-swanlab

72a3f9d

Fix: annotation

chore: opt some comments

dd85054

Merge pull request #2 from SAKURA-CAT/integrate-swanlab

6f51ccf

chore: opt some comments

Merge branch 'main' into integrate-swanlab

1ffd8b5
