Runtime Error?

I’m using the tool above, but this and similar tools are currently giving errors. Does this have any meaning? Can I use this tool offline on my computer? Or do I need to wait for it to fix itself? I’m not very familiar with the technical aspects. Could you please help me with this?

It’s a version compatibility issue. Broadly speaking, it’s similar to the familiar situation where “a specific piece of software stopped working after a Windows update.”

Or do I need to wait for it to fix itself?

Yeah. Basically this. Or if you’re comfortable with coding:

  • Fix the code in the Space yourself and run it as a Zero GPU Space (available to Pro subscribers), or run it locally (though a Space designed for Zero GPU, like that one, will need some porting work to run locally)
  • Report the issue to the creator via the “Discussion” section in the Community tab of the problematic Space. Depending on the creator, this may speed up the fix
  • Find another similar Space that works properly
  • and so on…

Runtime error on AP123/IllusionDiffusion: what it means, likely cause, and practical fixes

The error is meaningful, and it points to a fairly specific kind of breakage.

The important part of the traceback is this:

from huggingface_hub import HfFolder, whoami
ImportError: cannot import name 'HfFolder' from 'huggingface_hub'

In plain terms:

The app is crashing while importing Gradio, before the actual image-generation code runs.

So this is probably not caused by:

  • a bad prompt,
  • a bad input image,
  • a browser problem,
  • a Stable Diffusion model problem,
  • a ControlNet problem,
  • a QR-code / illusion-image logic problem,
  • or a normal GPU out-of-memory error.

The crash happens earlier. Python starts loading the app, imports Hugging Face ZeroGPU support, that imports Gradio, Gradio imports its OAuth code, and then Gradio tries to import HfFolder from huggingface_hub. In the installed version of huggingface_hub, HfFolder is no longer available.
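If you want to verify this diagnosis against your own environment, a small probe (standard library only, plus whatever huggingface_hub happens to be installed) shows whether the installed version still exports HfFolder:

```python
# Probe: does the installed huggingface_hub still export HfFolder?
import importlib.util

spec = importlib.util.find_spec("huggingface_hub")
if spec is None:
    has_hffolder = None
    print("huggingface_hub is not installed")
else:
    import huggingface_hub
    has_hffolder = hasattr(huggingface_hub, "HfFolder")
    print(huggingface_hub.__version__, "-> HfFolder available:", has_hffolder)
```

On an environment that reproduces the Space’s failure, this should report a 1.x version with HfFolder unavailable.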



Short diagnosis

The likely cause is:

old Gradio
+
new huggingface_hub
=
ImportError: cannot import name 'HfFolder'

More specifically, the app appears to use an older Gradio runtime while allowing huggingface_hub to be installed without a safe upper version limit.

Older Gradio code expected this to work:

from huggingface_hub import HfFolder, whoami

But newer huggingface_hub versions removed the old HfFolder API. Hugging Face’s v1.0 migration guide explains that several deprecated APIs were removed in v1.0, and Gradio later fixed this class of issue by switching from HfFolder.get_token() to get_token() in Gradio 5.7.1.
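To illustrate the API change, here is a hedged compatibility shim in the spirit of Gradio’s upstream fix (the final fallback exists only so the snippet runs even without huggingface_hub installed):

```python
# Resolve a token-reading function across huggingface_hub versions.
try:
    # Modern API (still present in v1.x)
    from huggingface_hub import get_token
except ImportError:
    try:
        # Legacy API, removed in huggingface_hub v1.0
        from huggingface_hub import HfFolder
        get_token = HfFolder.get_token
    except ImportError:
        def get_token():
            return None  # huggingface_hub not installed at all

token = get_token()  # cached token string, or None if not logged in
print("token cached" if token else "no cached token")
```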



Why this can happen suddenly

A Hugging Face Space is not just the visible Python code. It is a combination of:

  • the Space source files,
  • the YAML block at the top of README.md,
  • the selected SDK,
  • the selected Gradio version,
  • the Python version,
  • requirements.txt,
  • the current package resolver result,
  • the Hugging Face runtime image,
  • the selected hardware,
  • cache state,
  • and, for ZeroGPU Spaces, the spaces / @spaces.GPU runtime behavior.

The Spaces configuration reference says Spaces are configured through the YAML block at the top of README.md, and that sdk_version specifies the Gradio version. The Spaces dependency docs say extra Python packages are installed from requirements.txt.

That means a Space can have an older Gradio version pinned in README.md, while huggingface_hub is left unpinned in requirements.txt.
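For illustration, the front matter of a Space’s README.md might look like this (these values are hypothetical, not copied from the actual Space):

```yaml
---
title: IllusionDiffusion
sdk: gradio
sdk_version: 4.36.1   # pins the Gradio runtime; hypothetical value
app_file: app.py
---
```

Note that nothing here constrains huggingface_hub; that only happens if requirements.txt pins it.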

A typical failure chain looks like this:

1. The Space rebuilds or restarts from a clean environment.
2. The Space keeps using an older Gradio version.
3. `huggingface_hub` is not pinned below 1.0.
4. The package installer resolves a newer `huggingface_hub`.
5. Old Gradio imports `HfFolder`.
6. New `huggingface_hub` does not provide `HfFolder`.
7. The app exits before the image-generation pipeline starts.

This also explains why several similar tools can fail around the same time. Many older Spaces were written against older dependency behavior but rebuild against newer packages.


The most likely fix

There are two main fixes.

Fix A: conservative recovery fix

Keep the older Gradio setup, but pin huggingface_hub below v1.

Add this to requirements.txt:

huggingface_hub<1.0

For this app, the dependency file would look like:

huggingface_hub<1.0
diffusers
transformers
accelerate
xformers
Pillow
qrcode
filelock
--extra-index-url https://download.pytorch.org/whl/cu118
torch

This is the safest first fix because it changes only the library version boundary that caused the import failure.

This is a recovery fix, not a full modernization. It says:

“Keep the old Gradio app working by keeping huggingface_hub on the old compatible API line.”

This is also the kind of fix seen in similar public cases where older apps broke after huggingface_hub 1.x removed HfFolder.



Fix B: forward migration fix

Upgrade Gradio to a version that no longer uses HfFolder.get_token().

A likely minimum target is:

sdk_version: 5.7.1

or a newer tested Gradio version.

This is cleaner long-term because Gradio 5.7.1 includes the upstream fix:

Use get_token instead of HfFolder.get_token.


However, I would not choose this as the first move unless you are maintaining the app. Upgrading Gradio can reveal other compatibility changes in:

  • components,
  • event handlers,
  • OAuth/login behavior,
  • custom CSS/JS behavior,
  • queue behavior,
  • history/profile UI,
  • and ZeroGPU integration.

So the beginner-safe order is:

First:
    add `huggingface_hub<1.0`

Then:
    rebuild and see whether the app starts

Later:
    consider a Gradio upgrade if you want a long-term maintained fork

If these are not your Spaces

If the Spaces are not yours, you cannot directly fix the public page.

You cannot directly:

  • edit requirements.txt,
  • change sdk_version,
  • run Factory rebuild,
  • change hardware,
  • remove OAuth,
  • change ZeroGPU settings,
  • or patch the source.

So your practical options are:

| Option | Control | Difficulty | Notes |
|---|---|---|---|
| Wait for the owner/platform to fix it | Low | Very easy | Fine for casual use, unreliable if you need the tool now. |
| Use another working tool | Low | Easy | Best if you only need similar output. |
| Duplicate the Space | Medium/high | Medium | Lets you patch dependencies, but GPU/ZeroGPU matters. |
| Run locally | High | Medium/hard | Best if you have an NVIDIA GPU and are willing to set up Python/CUDA. |

Hugging Face’s docs explain that Spaces can be duplicated and configured, but duplicated Spaces generally start from basic CPU hardware unless you select/upgrade hardware.



If you duplicate the Space

If you make your own duplicate, the first change I would make is only this:

huggingface_hub<1.0

Do not upgrade everything at once.

A good first requirements.txt patch would be:

huggingface_hub<1.0
diffusers
transformers
accelerate
xformers
Pillow
qrcode
filelock
--extra-index-url https://download.pytorch.org/whl/cu118
torch

Then rebuild.

If the HfFolder error disappears, the diagnosis was correct.

After that, a second error may appear. That does not mean the first fix was wrong. It means the app finally got past the first startup failure.

Possible second-layer errors include:

  • no GPU available,
  • ZeroGPU quota or timeout,
  • PyTorch / CUDA mismatch,
  • xformers binary mismatch,
  • model download failure,
  • insufficient VRAM,
  • OAuth/login behavior not working,
  • or a Gradio UI compatibility issue.

Important hardware note

This is a diffusion image-generation app. CPU-only execution is probably not a good experience.

Hugging Face ZeroGPU is a special shared-GPU runtime for Spaces. The ZeroGPU docs say it dynamically allocates and releases NVIDIA H200 GPUs for Spaces and uses the @spaces.GPU pattern.

So if you duplicate the app, make sure you understand the hardware:

CPU duplicate:
    probably too slow or unusable for this app

GPU / ZeroGPU duplicate:
    realistic

If you only get free CPU hardware, the app may start but generation may be painfully slow or fail.


If you want to run it locally

You can probably run it locally, but this is only realistic if you have an NVIDIA GPU.

This app is written like a CUDA/GPU app. It loads diffusion components and moves the pipeline to CUDA. Local CPU-only use is likely to be very slow or impractical.

Before trying the app locally, check whether PyTorch can see your GPU:

python -c "import torch; print(torch.__version__); print(torch.cuda.is_available())"

You want:

True

If it prints:

False

then do not debug the app yet. Fix your PyTorch/CUDA installation first.
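A slightly fuller version of that check, safe to run even on a machine without PyTorch (the import is probed first so the script degrades gracefully):

```python
# Local GPU diagnostic: degrades gracefully when PyTorch is missing.
import importlib.util

cuda_ok = False
if importlib.util.find_spec("torch") is None:
    print("PyTorch is not installed")
else:
    import torch
    cuda_ok = torch.cuda.is_available()
    print("torch", torch.__version__, "| CUDA available:", cuda_ok)
    if cuda_ok:
        print("device:", torch.cuda.get_device_name(0))
```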



Local setup idea

For a conservative local setup, use Python 3.10 and install compatible packages.

Example:

python -m venv venv
source venv/bin/activate
python -m pip install --upgrade pip

On Windows PowerShell, activation is usually:

.\venv\Scripts\Activate.ps1

Then install Gradio and the compatible Hub package:

pip install gradio==4.36.1 "huggingface_hub<1.0" diffusers transformers accelerate Pillow qrcode filelock

Install PyTorch separately, using the install selector on the official PyTorch website to pick the build that matches your CUDA version.

I would initially avoid installing xformers unless needed, because xformers can be sensitive to the exact PyTorch/CUDA version. First make the app boot. Optimize later.


Handling import spaces locally

The app uses Hugging Face ZeroGPU-style code:

import spaces

@spaces.GPU
def inference(...):
    ...

That is meant for Hugging Face Spaces / ZeroGPU. Locally, you are not running inside the same ZeroGPU runtime.

If local execution fails with:

ModuleNotFoundError: No module named 'spaces'

you can create a tiny local file named spaces.py next to app.py:

def GPU(func=None, **kwargs):
    # Accepts both bare use (@spaces.GPU) and
    # argument use (@spaces.GPU(duration=120)).
    def decorator(f):
        return f

    if callable(func):
        # Bare form: GPU was handed the function directly.
        return func

    # Argument form: return a decorator for the function.
    return decorator

This makes @spaces.GPU do nothing locally.

That is fine for a local GPU setup because your local machine already owns the GPU. You do not need Hugging Face’s ZeroGPU allocator.
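You can sanity-check that the stub handles both decorator forms; the stub is repeated here so the snippet runs standalone:

```python
def GPU(func=None, **kwargs):
    # Same stub as in spaces.py above.
    def decorator(f):
        return f
    if callable(func):
        return func
    return decorator

@GPU                      # bare form, as in `@spaces.GPU`
def double(x):
    return x * 2

@GPU(duration=120)        # argument form, as in `@spaces.GPU(duration=120)`
def plus_one(x):
    return x + 1

print(double(3), plus_one(3))  # both behave as if undecorated: 6 4
```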



Offline use: what is realistic?

“Offline” is possible only after setup.

The first run needs internet to install packages and download model files. Later runs can use the local Hugging Face cache if all required files are already present.

The Hugging Face Hub download guide explains that Hub downloads are cached locally. The environment variable docs explain that HF_HUB_OFFLINE=1 prevents HTTP calls and uses cached files only; if a needed cached file is missing, it raises an error.

So the realistic model is:

First run:
    internet required
    packages downloaded
    model files downloaded
    cache populated

Later runs:
    can work offline if all required model files are cached

Do not expect this to work offline from a fresh install.
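To see what is already cached before you go offline, you can list the default Hub cache location (the path logic below assumes the standard HF_HOME layout):

```python
# List the local Hugging Face Hub cache, if it exists.
import os
from pathlib import Path

hf_home = Path(os.environ.get("HF_HOME", Path.home() / ".cache" / "huggingface"))
hub_cache = hf_home / "hub"

if hub_cache.is_dir():
    for entry in sorted(hub_cache.iterdir()):
        print(entry.name)   # repo folders look like models--<org>--<name>
else:
    print("no Hub cache yet at", hub_cache)
```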

Offline mode examples:

export HF_HUB_OFFLINE=1
python app.py

Windows Command Prompt:

set HF_HUB_OFFLINE=1
python app.py

Windows PowerShell:

$env:HF_HUB_OFFLINE="1"
python app.py



Why I would not rewrite the model-loading code first

The app uses @spaces.GPU, and it may also move the model to CUDA at module/root level.

That can look strange if you are used to ordinary local Python scripts. But for ZeroGPU Spaces, Hugging Face documents a specific execution model around @spaces.GPU. So I would not start by moving model loading into the inference function.

First fix the import/dependency error. Then handle any new runtime error separately.

A safe debugging order is:

1. Fix `HfFolder` import crash.
2. Confirm the app starts.
3. Confirm models download/load.
4. Confirm CUDA/GPU is available.
5. Test one small generation.
6. Only then optimize speed, memory, duration, or UI behavior.

Likely next issues after the HfFolder fix

After adding huggingface_hub<1.0, you may uncover a second error. Common possibilities:

1. ZeroGPU duration / timeout

If generation starts but times out, the app may need a longer ZeroGPU duration.

Example patch:

@spaces.GPU(duration=120)
def inference(...):
    ...


2. xformers mismatch

If the next error mentions compiled extensions, CUDA symbols, missing operators, or binary mismatch, xformers may not match the installed PyTorch version.

A beginner-safe tactic is:

Get the app booting without `xformers` first.
Add `xformers` later only if needed.
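One hedged way to express that tactic in code: probe for xformers and only opt in if the import succeeds. The `enable_xformers_memory_efficient_attention()` call is diffusers’ standard opt-in; it is shown commented out because constructing the pipeline requires model downloads.

```python
# Optional xformers: the app should boot either way.
try:
    import xformers  # noqa: F401
    have_xformers = True
except ImportError:
    have_xformers = False

print("xformers available:", have_xformers)

# Later, after constructing the diffusers pipeline `pipe`:
# if have_xformers:
#     pipe.enable_xformers_memory_efficient_attention()
```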

3. CUDA out of memory

If you see something like:

CUDA out of memory

then your GPU may not have enough VRAM for the current resolution / model / second pass.

Possible mitigations:

  • lower resolution,
  • reduce inference steps,
  • avoid the second 1024x1024 pass,
  • use CPU offload,
  • use a bigger GPU,
  • or use a lighter workflow.
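To make the “lower resolution” mitigation concrete: in Stable Diffusion-style models the latent is 1/8 of the image per side, and activation memory scales roughly with latent area, so resolution changes compound quadratically. A quick back-of-envelope:

```python
# Rough activation-memory scaling with resolution (SD-style 1/8 latents).
def latent_area(h, w):
    return (h // 8) * (w // 8)

ratio = latent_area(1024, 1024) / latent_area(768, 768)
print(f"1024x1024 needs about {ratio:.2f}x the activation memory of 768x768")
```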


4. OAuth / login behavior

The app has Hugging Face OAuth-related behavior. That is part of why Gradio imports OAuth code. Locally, login/profile/history features may not behave the same as on Hugging Face Spaces.

If local generation works but login/history fails, that is a separate issue.



My practical recommendation

If you only want to use the tool casually, I would not start with local setup. Use another working public tool or wait.

If you want this exact tool and some control, duplicate the Space and apply the conservative dependency fix:

huggingface_hub<1.0

If you want independence from the public Space, run it locally only if you have an NVIDIA GPU and are willing to set up Python, PyTorch, CUDA, and model downloads.

My ranking:

Lowest effort:
    wait or use another working public tool

Best controlled non-local route:
    duplicate Space + add `huggingface_hub<1.0` + use GPU/ZeroGPU

Best independent route:
    local NVIDIA GPU setup

Worst beginner route:
    trying to run this exact app locally on CPU only

Final summary

  • The error means old Gradio code is importing HfFolder, but the installed huggingface_hub no longer provides it.
  • The app crashes during startup, before image generation begins.
  • The most likely cause is dependency drift: old Gradio plus new huggingface_hub.
  • The safest immediate fix is adding huggingface_hub<1.0 to requirements.txt.
  • The longer-term fix is upgrading Gradio to a version that uses get_token() instead of HfFolder.get_token(), such as Gradio 5.7.1 or newer.
  • If the Space is not yours, you cannot fix the original directly.
  • You can duplicate it, but diffusion needs GPU/ZeroGPU hardware.
  • You can run it locally, but realistically you want an NVIDIA GPU.
  • Offline use means “download once, then run from cache,” not “run from a fresh install without internet.”