Fine-tuning GPU options

Fine-tune jobs can be run on different GPUs, affecting speed, price, and quality.

Sample Comparison

Overview:

In this exercise, we compare the performance and cost-effectiveness of three different model training pipelines: A100 (Balanced), H100 (Fast), and L40S (Low Cost). The training was conducted using the following parameters:

  • Dataset: 10MB zip file

  • Steps: 500

Training Configuration
{
  "resolution": 1024,
  "repeats": 100,
  "learning_rate": 0.0004,
  "lr_scheduler": "constant",
  "optimizer_type": "Adafactor",
  "num_train_epochs": 500,
  "steps": 500,
  "gradient_accumulation_steps": 2,
  "center_crop": false,
  "lora_rank": 32,
  "noise_offset": 0,
  "max_grad_norm": 0,
  "bucket_steps": 64,
  "weight_decay": 0.01,
  "relative_step": false,
  "auto_caption": false,
  "content_or_style": "balanced",
  "batch_size": 2,
  "linear": 16,
  "linear_alpha": 16,
  "prompt": "driving a F1 car",
  "iterations": 300,
  "captioning": true,
  "priority": "QUALITY"
}
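One derived quantity worth noting in this configuration: with gradient accumulation, the effective batch size per optimizer step is batch_size × gradient_accumulation_steps. A minimal sanity check, using values copied from the config above:

```python
import json

# Subset of the training configuration above, copied verbatim.
config = json.loads('{"batch_size": 2, "gradient_accumulation_steps": 2}')

# Gradients are accumulated over 2 micro-batches of 2 samples each,
# so each optimizer step effectively sees 4 samples.
effective_batch_size = config["batch_size"] * config["gradient_accumulation_steps"]
print(effective_batch_size)  # 4
```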

Comparison Table:

Pipeline Name | Characteristics | Charge Rate ($/s) | Duration (s) | Total Cost
A100          | Balanced        | $0.002            | 1868.67      | $3.74
H100          | Fast            | $0.0043           | 1085.85      | $4.67
L40S          | Low Cost        | $0.0014           | 2532.14      | $3.54

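The totals in the table follow directly from charge rate × duration. A minimal sketch of that arithmetic (rates and durations taken from the table above):

```python
# Per-pipeline charge rate ($/s) and observed duration (s), from the table above.
PIPELINES = {
    "A100": {"rate": 0.002,  "seconds": 1868.67},
    "H100": {"rate": 0.0043, "seconds": 1085.85},
    "L40S": {"rate": 0.0014, "seconds": 2532.14},
}

def total_cost(name: str) -> float:
    """Total job cost: charge rate multiplied by wall-clock seconds."""
    p = PIPELINES[name]
    return round(p["rate"] * p["seconds"], 2)

for name in PIPELINES:
    print(name, total_cost(name))  # A100 3.74, H100 4.67, L40S 3.54
```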

Insights:

  1. A100 (Balanced):

    1. Offers balanced performance with moderate cost and training time.

    2. Suitable for scenarios where a trade-off between speed and cost is acceptable.

  2. H100 (Fast):

    1. Significantly faster but at a higher charge rate.

    2. Ideal for time-sensitive tasks where speed is prioritized over cost.

  3. L40S (Low Cost):

    1. Lowest cost option but takes the longest time to complete.

    2. Best suited for non-urgent tasks where cost efficiency matters more than turnaround time.
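The trade-offs above can be encoded as a small selection helper. This is a sketch, not a real API: the function name and the "speed"/"cost" priority labels are illustrative, and the figures come from the comparison table.

```python
# Charge rate ($/s) and duration (s) per pipeline, from the comparison table.
PIPELINES = {
    "A100": {"rate": 0.002,  "seconds": 1868.67},
    "H100": {"rate": 0.0043, "seconds": 1085.85},
    "L40S": {"rate": 0.0014, "seconds": 2532.14},
}

def pick_pipeline(priority: str) -> str:
    """Pick the fastest pipeline for 'speed', the cheapest for 'cost'."""
    if priority == "speed":
        return min(PIPELINES, key=lambda n: PIPELINES[n]["seconds"])
    if priority == "cost":
        return min(PIPELINES, key=lambda n: PIPELINES[n]["rate"] * PIPELINES[n]["seconds"])
    raise ValueError(f"unknown priority: {priority}")

print(pick_pipeline("speed"))  # H100
print(pick_pipeline("cost"))   # L40S
```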
