Introducing Stable-Diffusion.cpp (Inference in Pure C/C++) (lemmy.world)

submitted 1 year ago by Blaed@lemmy.world to c/technology@lemmy.world


cross-posted from: https://lemmy.world/post/3549390

stable-diffusion.cpp

Introducing stable-diffusion.cpp, a pure C/C++ inference engine for Stable Diffusion! This is a really awesome implementation to help speed up home inference of diffusion models.

Tailored for developers and AI enthusiasts, this repository offers a high-performance solution for creating and manipulating images using various quantization techniques and accelerated inference.

https://github.com/leejet/stable-diffusion.cpp

Key Features:

Efficient Implementation: Written in plain C/C++ and built on the ggml framework, it works the same way as llama.cpp.

Multiple Precision Support: Choose between 16-bit and 32-bit float, or 4-bit, 5-bit, and 8-bit integer quantization.

Optimized Performance: Experience memory-efficient CPU inference with AVX, AVX2, and AVX512 support for x86 architectures.

Versatile Modes: Supports the original txt2img and img2img modes, plus negative prompts.

Cross-Platform Compatibility: Runs smoothly on Linux, Mac OS, and Windows.
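
To make the integer quantization feature above more concrete, here is a minimal sketch of symmetric 4-bit block quantization. It is not ggml's actual storage format: the block size of 32, the single 16-bit scale per block, and every function name are assumptions chosen for illustration.

```python
BLOCK_SIZE = 32   # weights per block (assumed for this sketch)
SCALE_BYTES = 2   # one 16-bit float scale per block (assumed for this sketch)


def quantize_block_4bit(values):
    """Quantize one block of floats to signed 4-bit integers plus one scale."""
    if len(values) != BLOCK_SIZE:
        raise ValueError(f"expected {BLOCK_SIZE} values, got {len(values)}")
    max_abs = max(abs(v) for v in values)
    # Map the largest magnitude to 7, the largest positive signed 4-bit value.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    quants = [max(-8, min(7, round(v / scale))) for v in values]
    return scale, quants


def dequantize_block_4bit(scale, quants):
    """Recover approximate floats from the 4-bit integers and the block scale."""
    return [q * scale for q in quants]


def pack_nibbles(quants):
    """Pack pairs of 4-bit values into bytes (offset by 8 so they are unsigned)."""
    packed = bytearray()
    for lo, hi in zip(quants[0::2], quants[1::2]):
        packed.append(((hi + 8) << 4) | (lo + 8))
    return bytes(packed)


def block_size_bytes(packed):
    """Storage for one block: the scale plus the packed 4-bit payload."""
    return SCALE_BYTES + len(packed)


if __name__ == "__main__":
    weights = [(i - 16) / 16.0 for i in range(BLOCK_SIZE)]
    scale, quants = quantize_block_4bit(weights)
    restored = dequantize_block_4bit(scale, quants)
    max_err = max(abs(a - b) for a, b in zip(weights, restored))
    size = block_size_bytes(pack_nibbles(quants))
    print(f"scale={scale:.4f} max_error={max_err:.4f}")
    print(f"{size} bytes per block vs {BLOCK_SIZE * 4} bytes as 32-bit floats")
```

Under these assumptions a block of 32 weights takes 18 bytes, or 4.5 bits per weight, instead of 128 bytes as 32-bit floats. The price is a rounding error of up to half the block scale on each weight.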

Getting Started

Cloning, building, and running are made simple, and detailed examples are provided for both text-to-image and image-to-image generation. With an array of options for precision and comprehensive usage guidelines, you can easily adapt the code for your specific project requirements.

    git clone --recursive https://github.com/leejet/stable-diffusion.cpp
    cd stable-diffusion.cpp

If you have already cloned the repository, you can use the following command to update the repository to the latest code:

    cd stable-diffusion.cpp
    git pull origin master
    git submodule update

More Details

Plain C/C++ implementation based on ggml, working in the same way as llama.cpp

16-bit, 32-bit float support

4-bit, 5-bit and 8-bit integer quantization support

Accelerated memory-efficient CPU inference

Only requires ~2.3GB when using txt2img with fp16 precision to generate a 512x512 image

AVX, AVX2 and AVX512 support for x86 architectures

Original txt2img and img2img mode

Negative prompt

stable-diffusion-webui style tokenizer (not all the features, only token weighting for now)

Sampling method

Euler A

Supported platforms

Linux

Mac OS

Windows
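
For readers unfamiliar with the Euler A sampler listed above, the following is a rough sketch of the Euler ancestral update as it is commonly formulated in diffusion sampling code. It is not taken from stable-diffusion.cpp: the `toy_denoiser`, the `sample` helper, and the noise schedule are hypothetical stand-ins, and a real run would call the diffusion model where the toy denoiser is used.

```python
import math
import random


def ancestral_sigmas(sigma_from, sigma_to):
    """Split a step into a deterministic part (sigma_down) and fresh noise (sigma_up)."""
    if sigma_to == 0.0:
        return 0.0, 0.0
    sigma_up = min(
        sigma_to,
        math.sqrt(sigma_to**2 * (sigma_from**2 - sigma_to**2) / sigma_from**2),
    )
    sigma_down = math.sqrt(sigma_to**2 - sigma_up**2)
    return sigma_down, sigma_up


def euler_ancestral_step(x, denoised, sigma_from, sigma_to, rng):
    """Advance the sample x from noise level sigma_from to sigma_to."""
    sigma_down, sigma_up = ancestral_sigmas(sigma_from, sigma_to)
    out = []
    for xi, di in zip(x, denoised):
        d = (xi - di) / sigma_from               # slope estimate at sigma_from
        xi = xi + d * (sigma_down - sigma_from)  # Euler step down to sigma_down
        if sigma_up > 0.0:
            xi += rng.gauss(0.0, 1.0) * sigma_up  # re-inject fresh noise
        out.append(xi)
    return out


def sample(denoiser, x, sigmas, seed=0):
    """Run the sampler over a decreasing noise schedule that ends at 0."""
    rng = random.Random(seed)
    for sigma_from, sigma_to in zip(sigmas[:-1], sigmas[1:]):
        denoised = denoiser(x, sigma_from)
        x = euler_ancestral_step(x, denoised, sigma_from, sigma_to, rng)
    return x


if __name__ == "__main__":
    target = [0.25, -0.5, 1.0]

    def toy_denoiser(x, sigma):
        # Hypothetical stand-in for the model: always predicts the clean target.
        return target

    start = [3.0, -2.0, 0.5]
    print(sample(toy_denoiser, start, [10.0, 5.0, 1.0, 0.0]))
```

The "A" (ancestral) part is the noise added back at each step. It is why this sampler keeps changing the image as the step count grows instead of settling on one result.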

This is a really exciting repo. I'll be honest, I'm not that well versed in diffusion inference, but I do know that more efficient and effective ways of running these models are always welcome among people who use diffusers frequently, especially those who need to multitask and keep some performance headroom.


[–] olicvb@lemmy.ca 6 points 1 year ago* (last edited 1 year ago) (1 children)

Got a 1.42s generation on the Cpp one and 2.1s with auto1111's SD (note my torch is outdated, model was converted to fp16).

Though I'm having trouble finding the generated image 😅.

All on the same generation settings, 5800x cpu & 3080 12gb

[–] JackGreenEarth@lemm.ee 1 points 1 year ago (2 children)

I'd love a 2s generation, it usually takes about 60s with my 1660ti 4gb

[–] olicvb@lemmy.ca 3 points 1 year ago (1 children)

oh that's rough, this was on 512x512 20 steps. I usually do 768x768 50 steps for 33 seconds, gets me better quality without using upscale.

[–] JackGreenEarth@lemm.ee 3 points 1 year ago

768x768 50 steps takes me several minutes

[–] Fubarberry@sopuli.xyz 1 points 1 year ago (1 children)

Are you sure that it's using your GPU? That seems way slower than I'd expect.

[–] JackGreenEarth@lemm.ee 1 points 1 year ago

I'm using it on low, because it runs out of memory whenever I try medium or high