Stable Diffusion & ControlNet

Stable Diffusion

There are two ways to use Stable Diffusion with Odyssey: through API calls to Stability AI’s DreamStudio, or by running the model locally on your machine. This section will help you select the option that best fits your use case.

The Stable Diffusion API makes calls to Stability AI’s DreamStudio endpoint. To use this node, you will need to add your API key, which can be found here. If you don’t have a Stability AI account, you will need to create one and add credits to it before you can access the API key.

Generating images with the API incurs a cost. Complete pricing details are outlined here, but as a rough guide, $10 generates roughly 500 SDXL images, or about $0.02 per image.

The advantages of using the API are:

  • Image generation can be faster - especially for older computers

  • The API will support a wider range of Stable Diffusion models

  • The API may support newer models that are not yet running locally

The primary disadvantages are the per-image cost and the inability to use ControlNet with the API.

With the API node, we support the following configuration parameters:

  • Engine is the Stable Diffusion model. Odyssey supports Version 1.0 XL (1024 x 1024), Version 0.9 XL (1024 x 1024), Version 2.2.2 XL Beta (512 x 512), and Version 1.6 (512 x 512)

  • Steps are the number of times your image is diffused. The more steps, the longer the image takes to generate. The API node defaults to 50 steps

  • Seed is the value that Stable Diffusion uses to initialize your image. Seed values are randomized to start but stay consistent after your first generation. Click the dice to randomize the seed value.

  • Guidance scale controls how closely Stable Diffusion follows your prompt. Higher values adhere more closely to the prompt

  • Number of images sets how many images your output produces. If you select more than one, connect a ‘batch images’ node to Stable Diffusion to display them all

  • Safe mode turns on Stable Diffusion’s NSFW filter

  • Starting image influence dictates the percentage of steps devoted to following the connected image input. The more influence the starting image has, the more closely your result will match it

  • Guidance preset is an advanced setting that selects a CLIP guidance preset for generation; most workflows can leave it at the default

  • Scheduler is an advanced setting that defaults to K-DPMPP_2M. Here’s a comparison of all the schedulers on the same prompt to give a sense of the differences

A comparison chart showing different schedulers for Stable Diffusion

  • Style contains presets that come from the Stable Diffusion API. Similar effects can be accomplished through prompting
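For a sense of how the API node’s settings correspond to an actual request, here is a minimal sketch of calling the DreamStudio endpoint from Python. The host, engine id, and JSON field names below follow Stability’s v1 REST API as we understand it and are assumptions that may differ across API versions; inside Odyssey, the node handles all of this for you.

```python
import json
import urllib.request

API_HOST = "https://api.stability.ai"       # Stability's REST host (assumed)
ENGINE = "stable-diffusion-xl-1024-v1-0"    # engine id for Version 1.0 XL (assumed)

def build_payload(prompt, steps=50, cfg_scale=7.0, seed=0, samples=1):
    """Map the node's settings onto the API's JSON request body."""
    return {
        "text_prompts": [{"text": prompt}],
        "steps": steps,          # "Steps" setting (API node default: 50)
        "cfg_scale": cfg_scale,  # "Guidance scale" setting
        "seed": seed,            # "Seed" setting
        "samples": samples,      # "Number of images" setting
    }

def generate(api_key, prompt, **kwargs):
    """POST the request; the response's `artifacts` hold base64-encoded images."""
    req = urllib.request.Request(
        f"{API_HOST}/v1/generation/{ENGINE}/text-to-image",
        data=json.dumps(build_payload(prompt, **kwargs)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["artifacts"]
```

The payload builder is separated out so the settings-to-request mapping is visible without making a network call.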

There are a few key differences with the locally run Stable Diffusion node:

  • Model options are dictated by which models you download when you first start using the app. You can also upload custom models to Odyssey - which are covered below.

  • Compute units give you the option to run Stable Diffusion on your CPU & GPU, CPU & Neural Engine, or all three. We have not found significant performance differences among these options, though results may vary on older computers

  • Reduce memory usage is useful if you’re running Odyssey on an older Mac. While generation may be slower, reducing memory usage helps ensure that Odyssey does not crash from running out of memory

Inpainting

Inpainting can be done through the Mask (Inpainting) input on your Stable Diffusion node. By connecting a mask to the Mask (Inpainting) input, you control which area of an image is manipulated by the model.

To obtain a mask, use the Remove Background node, then connect its Mask output to the Mask (Inpainting) input.

An image of inpainting in Odyssey
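A mask is just a grayscale image. If you ever want to construct one by hand, say to repaint a fixed rectangle, the sketch below builds one with only the Python standard library and saves it as a PGM file. It assumes the common inpainting convention that white marks the region the model may repaint; in Odyssey, the Remove Background node produces a mask for you.

```python
def rect_mask(width, height, box):
    """Binary mask: 255 (white) inside `box` marks the region the model
    may repaint; 0 (black) elsewhere preserves the source image."""
    x0, y0, x1, y1 = box
    return [
        [255 if x0 <= x < x1 and y0 <= y < y1 else 0 for x in range(width)]
        for y in range(height)
    ]

def write_pgm(path, pixels):
    """Save as a plain-text PGM, which most image tools can open."""
    height, width = len(pixels), len(pixels[0])
    with open(path, "w") as f:
        f.write(f"P2\n{width} {height}\n255\n")
        for row in pixels:
            f.write(" ".join(map(str, row)) + "\n")

# Repaint the centre of a 512 x 512 image, keeping everything else.
mask = rect_mask(512, 512, (128, 128, 384, 384))
write_pgm("mask.pgm", mask)
```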

ControlNet

Different images showing the Mona Lisa with image effects applied

ControlNet is a method of controlling certain regions of an image generated with Stable Diffusion. Odyssey supports a few different methods of leveraging ControlNet through the locally run Stable Diffusion node. These methods can be found in “conditioning.” To learn more about ControlNet, read our tutorial here.

A few notes about using ControlNet before we look at each individual option:

  • Like Stable Diffusion, ControlNet models will need to be downloaded. You can find all supported ControlNet options in Settings > Downloads

  • You can select one or multiple ControlNet options by checking off the corresponding checkbox

  • Each ControlNet option has a corresponding node whose output connects to the matching ControlNet input

The ControlNet options that Odyssey currently supports are:

  • Canny Edges uses a Canny edge-detection algorithm to show the edges in an image. You can use the Canny node to control which parts of an image Stable Diffusion will draw into. Canny works well for objects and structured poses, but it can also outline facial features such as wrinkles

  • Holistic edges uses holistically nested edge detection (HED) to draw edges in an image with softer, less crisp outlines. HED is generally considered better at preserving detail than Canny edges

  • MLSD corresponds to Odyssey’s “trace edges” node. Trace edges draws the edges found in an image but, unlike Canny or holistic edges, retains the image’s color. You can combine trace edges with a desaturation node to create an image for the MLSD input

  • Scribble takes a drawing and uses it to shape the generated image. This is especially effective when you draw on a blank image node from your phone or iPad, for example, a quick smiley face

  • Line art extracts clean line drawings from an image and uses them to guide the generated image

  • Line art anime is a variant of line art tuned for anime-style drawings

  • Depth uses the depth map node to take a grayscale image that represents the distance of objects in the original image to the camera. Depth is often considered an enhanced version of image-to-image and can help you synthesize subject and background separately

  • Mask uses the segmentation from the remove background node to keep the output of Stable Diffusion consistent

  • Instruction guides generation with a written editing instruction applied to the input image, rather than a full descriptive prompt

  • Tile adds detail to an image that lacks detail. To use Tile, take a portion of an image and then run it into the Tile input on Stable Diffusion. The output will fill in a significant amount of detail

  • QR code uses the QR Monster ControlNet model to embed QR codes, patterns such as spirals, and text into an image
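To build intuition for what the edge-based inputs hand to ControlNet, here is a toy gradient-threshold edge detector in pure Python. It is only an illustration: a real Canny pass adds Gaussian smoothing, non-maximum suppression, and hysteresis thresholding on top of this idea, and in Odyssey you would simply use the Canny or holistic edges node.

```python
def edge_map(img, threshold=128):
    """Mark pixels where the horizontal or vertical intensity gradient
    exceeds `threshold`. `img` is a 2D list of grayscale values (0-255);
    the result is a same-sized map with 255 at edges and 0 elsewhere."""
    h, w = len(img), len(img[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]  # horizontal gradient
            gy = img[y + 1][x] - img[y - 1][x]  # vertical gradient
            if abs(gx) + abs(gy) > threshold:
                edges[y][x] = 255
    return edges
```

A white-on-black map like this is exactly the kind of structural hint the edge inputs give Stable Diffusion: the model is steered to place object boundaries where the map is white.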

You can also change the conditioning for ControlNet with the following options:

  • Conditioning start dictates the step at which your ControlNet input begins impacting the image

  • Conditioning end dictates the step at which your ControlNet input stops impacting the image

  • Conditioning strength determines how strongly the ControlNet input affects the image during its active steps

  • Conditioning guidance determines how much the image generation adheres to the ControlNet input. Higher guidance means higher adherence to the input
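Conditioning start and end are fractions of the total step count, so with 50 steps, a start of 0.0 and an end of 0.5 applies the ControlNet input during the first 25 steps. A small sketch of that arithmetic; the fraction-of-steps interpretation is an assumption based on how these settings commonly work in ControlNet pipelines:

```python
def conditioning_window(total_steps, start, end):
    """Return the range of diffusion steps during which a ControlNet
    input is active, given start and end as fractions in [0, 1]."""
    assert 0.0 <= start <= end <= 1.0
    return range(int(total_steps * start), int(total_steps * end))

# e.g. with 50 steps, start=0.0 and end=0.5 covers steps 0 through 24
```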

