Model Status

A model's status can be cold or warm. The status indicates how long it takes to launch a machine learning model so that it can accept requests.

  • Cold Boot: When a model hasn't been used in a while, it gets turned off to conserve resources. This is similar to completely turning off your computer. When you make a request to use the model again, it needs to be fully loaded and started up, which can take several minutes for large models. This is a cold boot.

  • Warm Boot: If a model has been used recently, it stays loaded and ready to accept requests. This is similar to putting your computer in sleep mode. When you use a warm model, the response is much faster because the model is already up and running.
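Because a cold boot can take several minutes, a client usually needs a generous timeout and a retry policy. The following is a minimal sketch of that idea, not Segmind's client code: `call_with_cold_boot_tolerance`, the `send` callable, and the default timeout and backoff values are all hypothetical names and numbers chosen for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_cold_boot_tolerance(
    send: Callable[[float], T],
    timeout_s: float = 300.0,   # assumed: long enough to cover a cold boot
    max_attempts: int = 3,      # assumed retry budget
    backoff_s: float = 5.0,     # assumed base wait between attempts
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send(timeout_s), retrying on TimeoutError with linear backoff.

    `send` is a hypothetical callable that performs one request and raises
    TimeoutError if the model does not answer within `timeout_s` seconds.
    """
    last_error: Optional[TimeoutError] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return send(timeout_s)
        except TimeoutError as err:
            last_error = err
            # Wait longer after each failed attempt, but not after the last one.
            if attempt < max_attempts:
                sleep(backoff_s * attempt)
    raise TimeoutError(
        f"model did not respond after {max_attempts} attempts"
    ) from last_error
```

In practice, `send` would wrap whatever HTTP call your application makes to the model. The first attempt against a cold model may time out while the model loads, and a later attempt is then more likely to reach a warm model.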

Here's why this happens:

  • Segmind has a large library of models, and keeping them all running all the time would use a lot of resources.

  • Segmind only runs the models that are actually being used.

  • Cold boots happen more often for less frequently used models.
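The reasoning above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `ModelPool` class, the idle timeout, and the eviction rule are hypothetical, and Segmind's actual scheduling policy may differ.

```python
class ModelPool:
    """Toy model of a serving pool that unloads models after an idle period."""

    def __init__(self, idle_timeout_s: float):
        # Assumed policy: a model is unloaded once it has been idle
        # for longer than idle_timeout_s seconds.
        self.idle_timeout_s = idle_timeout_s
        self._last_used = {}  # model name -> time of its last request

    def request(self, model: str, now: float) -> str:
        """Return "warm" or "cold" for a request arriving at time `now`."""
        self._evict_idle(now)
        status = "warm" if model in self._last_used else "cold"
        self._last_used[model] = now  # the model is loaded after this request
        return status

    def _evict_idle(self, now: float) -> None:
        # Unload every model whose last request is older than the timeout.
        idle = [
            name
            for name, last in self._last_used.items()
            if now - last > self.idle_timeout_s
        ]
        for name in idle:
            del self._last_used[name]


pool = ModelPool(idle_timeout_s=600)
print(pool.request("popular-model", now=0))    # first request: cold
print(pool.request("popular-model", now=60))   # used recently: warm
print(pool.request("rare-model", now=60))      # first request: cold
print(pool.request("rare-model", now=3600))    # idle too long: cold again
```

With this policy, a model that receives requests more often than the idle timeout stays warm, while a rarely used model pays the cold boot cost on almost every request.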
