Image Generation

The Four Best Locally Run Image Models

Aug 27, 2024 | 7 min read

Example images for SDXL

There are a ton of advantages to running AI models locally. While Stable Diffusion has been the gold standard for open source, locally run AI models, a range of Stable Diffusion versions, fine tuned custom models, and even competitors have emerged.

While sites like CivitAI seem to have a model for everything, let's take a look at our four favorite locally run image models.

Stable Diffusion 1.5

Released in October 2022 and open source, Stable Diffusion 1.5 (SD1.5) became the foundational model the Stable Diffusion ecosystem was built around. At the time, SD1.5 was state of the art, producing some of the highest quality imagery in the burgeoning space. The real advantage, however, came from the model being open source.

Because the model was released under the CreativeML OpenRAIL-M license, developers could not only download it but build on top of it. This led to the development of tools such as ControlNet, which lets you precisely control the output of an image. Also, because of how long SD1.5 has been around, there are a large number of fine tuned models that use SD1.5 as a base to build a better model, such as:

  • Realistic Vision - an excellent model that dramatically improves on SD1.5. By combining Realistic Vision with ControlNet, you can generate realistic images with incredible control

Images generated using Realistic Vision 5.1

Prompt: RAW photo of a [subject] 8k uhd, dslr, soft lighting, high quality, film grain, Fujifilm XT3

  • Dreamshaper - a model that's a bit more "all-purpose" than Realistic Vision since it leans less heavily toward realism. The model ranges from 3D to anime and can similarly leverage ControlNet.

    Images generated using Dreamshaper


    Prompt 1: gorgeous realistic portrait of a half woman half cyborg, terrified cyborg in a bright white room, realism, 8k, UHD

    Prompt 2: gorgeous watercolor of a beautiful pond at sunset, beautiful colors, brushstrokes, incredible quality

    This is why we still think Stable Diffusion 1.5 is one of the best local models you can use - it runs incredibly fast locally, there's a mature ecosystem of custom models (like the ones we support on HuggingFace), and ControlNet support is in-depth.
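The prompt patterns shown above are easy to templatize when you're generating batches of images. As a quick illustration (the helper name and its wiring are our own, not part of any SD1.5 or Realistic Vision API), here's a minimal sketch that fills the "RAW photo" template from the Realistic Vision example:

```python
# Minimal sketch: fill the Realistic Vision "RAW photo" prompt template
# shown above. The helper is illustrative, not part of any official API.

RAW_PHOTO_TEMPLATE = (
    "RAW photo of a {subject}, 8k uhd, dslr, soft lighting, "
    "high quality, film grain, Fujifilm XT3"
)

def build_prompt(subject: str) -> str:
    """Substitute a subject into the RAW-photo prompt template."""
    return RAW_PHOTO_TEMPLATE.format(subject=subject.strip())
```

A call like `build_prompt("red fox")` yields a full prompt ready to paste into any SD1.5 front end.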

Stable Diffusion XL

Released in July of 2023, SDXL was a big leap in quality over SD1.5. Even the base model could generate better images from much simpler prompts.

Images generated using Stable Diffusion XL

A collection of SDXL images - courtesy of Stable Diffusion

SDXL is also open source - though the ecosystem that’s built around it is still somewhat nascent. Primarily, ControlNet support is still not as robust as it is for SD1.5 - though IP Adapters have begun emerging as the popular way for controlling SDXL images.

One of the biggest upsides has been the release of Turbo and Lightning models. These models drastically reduce the number of steps it takes to generate an image, which makes image generation extremely fast.

A few of our favorite SDXL fine tuned models also have Lightning versions - giving exceptional image quality without a lot of latency:

  • Realistic Vision XL Lightning - the Realistic Vision series of models has become our go-to due to its ability to generate people and a variety of photography styles. With the Lightning models, generation time is extremely fast and an image can be generated in just ten steps

    Images generated using Realistic Vision XL Lightning


    Prompt 1: extremely detailed, landscape of an unknown planet, monolith, lake, cloudy weather, unreal engine 5, perfect composition, vibrant, rtx, hbao

    Prompt 2: instagram photo, portrait photo of (man:2.1), 28 y.o, perfect face, natural skin, hard shadows, film grain

  • Juggernaut X Lightning - Juggernaut models are massive and cover a wide array of image types. We like the Lightning models here as well, as they can bring image generation down to just 4 steps


    Images generated using Juggernaut X Lightning


    Prompt 1: 3D render of a sci-fi baroque concept design of anatomically correct brain device with terrarium, steampunk, intricate details, scientific, hyper detailed, photorealistic


    Prompt 2: instagram photograph of a bungalow, sunset, ocean, architectural lighting, modern design, wooden walkway, reflection on water, blue hour, clear sky, fujifilm, 35mm
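The step counts above translate directly into latency, since diffusion sampling time scales roughly linearly with the number of steps. A back-of-the-envelope sketch (the per-step time below is an assumed placeholder for illustration, not a benchmark of any model):

```python
# Back-of-the-envelope sketch of why Lightning models feel so fast:
# total generation time scales roughly linearly with step count.

def estimated_seconds(steps: int, seconds_per_step: float) -> float:
    """Rough latency estimate: steps x time per denoising step."""
    return steps * seconds_per_step

SECONDS_PER_STEP = 0.5  # assumed placeholder, not a measured value

base_sdxl = estimated_seconds(40, SECONDS_PER_STEP)  # a typical base SDXL run
lightning = estimated_seconds(4, SECONDS_PER_STEP)   # a 4-step Lightning run

print(f"~{base_sdxl / lightning:.0f}x fewer step-seconds")  # prints: ~10x fewer step-seconds
```

The real-world speedup varies with hardware and scheduler, but the linear relationship is why dropping from ~40 steps to 4-10 steps is such a dramatic win.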

Flux

The Flux series of models took the AI art community by storm and has quickly become one of the most popular locally run options available. Image quality is consistent and spans a wide range of styles, and the models handle difficult things like hands and text extremely well. Prompt adherence is another advantage, which puts the Flux series squarely at the top of locally run AI models.

Image generated using Flux

Since Flux was built by Black Forest Labs (a well-funded company composed of some of the earliest Stable Diffusion engineers), the architecture is very similar to Stable Diffusion - which means custom versions of Flux, and even ControlNets, are already starting to emerge.

For the first time, this means that Stable Diffusion has true competition on the market - which we think is only a good sign with all the chaos that Stability AI has experienced over the past few months.

Stable Diffusion 3 (maybe)

The last model we want to highlight is Stable Diffusion 3 - with one big caveat: the jury is still out. When Stability AI first released Stable Diffusion 3, they released a smaller model that had some major issues with human anatomy. Bodies looked contorted, limbs were often askew, and the model really struggled with the types of generation that earlier models excelled at.

There were also confusing legal terms associated with running the models locally, which is one of the main reasons why Odyssey didn't include Stable Diffusion 3 as an option at launch.

When you ran the larger model via the API, though, those issues were nowhere to be seen. In fact, through the API, Stable Diffusion 3 was state of the art and looked like a technological breakthrough for prompt adherence, text generation, and baseline image quality.

Images generated using Stable Diffusion 3

A collection of SD3 images via API, courtesy of Stability AI

There have been rumors that Stability AI is working on releasing a model update that performs closer to the API version and has updated legal guidelines. If that's the case, we expect SD3 to be right there in the conversation with Flux as the best locally run model available.

Conclusion

Different models have different benefits - and some unknowns. By comparing models across quality, speed, control, and availability (particularly in apps like Odyssey), you can make an informed decision about which locally run image model to use.

Try Odyssey for free

Download for Mac
