Compare commits

28 Commits

Author SHA1 Message Date
Kenneth Estanislao d9a5500bdf Merge pull request #1713 from TeachDian/fix-1705-wsl-onnxruntime-gpu 2026-03-29 04:54:34 +08:00
TeachDian 86134b6e1d Fix #1705: Update onnxruntime-gpu requirement to 1.23.2 for WSL compatibility 2026-03-29 04:46:48 +08:00
Kenneth Estanislao 9e6f30c0a4 silenced deprecation 2026-03-27 21:35:27 +08:00
Kenneth Estanislao 97321a740d Update face_analyser.py
320 was over optimized, put back to 640
2026-03-27 21:24:19 +08:00
Kenneth Estanislao f5f7ac7764 Revise README for clarity and formatting
Updated README to remove emoji and clarify GPU support details.
2026-03-23 10:02:50 +08:00
Kenneth Estanislao 77d3492eef Add download link for models in README
Added a section for downloading models from Hugging Face.
2026-03-13 23:39:46 +08:00
Kenneth Estanislao 8e3d6e7c65 Add emoji to project title in README
Just want to add an emoji 😝
2026-03-13 22:17:32 +08:00
Kenneth Estanislao ee9699ee70 Happy 80k!
2.1 Released!

- Face randomizer added!
2026-03-13 22:09:18 +08:00
Kenneth Estanislao 3c8b259a3f Some edits on the UI
- Grouped the face enhancers
- Make the mouth mask just a slider
- Removed the redundant switches
2026-03-13 22:03:28 +08:00
Kenneth Estanislao 30b27c2b71 Update Quick Start section to v2.7 beta 2026-03-12 02:40:52 +08:00
Kenneth Estanislao 0d8f3b1f82 Fix on vulnerability report
https://github.com/hacksider/Deep-Live-Cam/issues/1695
2026-03-06 23:26:48 +08:00
KRSHH 6e9e7addf2 Update press section with recent media mentions 2026-03-03 21:16:56 +05:30
Kenneth Estanislao 0c7e871bfc Merge pull request #1689 from laurigates/pr/base-ui-tooltips
feat(ui): add hover tooltips to all controls
2026-02-28 02:41:07 +08:00
Lauri Gates e340b0da8a feat(ui): add hover tooltips to all controls
Add ToolTip class (modules/ui_tooltip.py) and wire descriptive hover
tooltips onto every button, switch, slider, and dropdown in the main
window. Tooltips appear after a 500ms hover delay and are clamped to
screen bounds.

This requires no new dependencies — ToolTip uses only customtkinter.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 21:41:24 +02:00
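The screen-bounds clamping mentioned in the tooltip commit above could look roughly like the sketch below. The function name `clamp_tooltip_position`, its parameters, and the `margin` default are hypothetical and are not taken from `modules/ui_tooltip.py`; the real class may compute positions differently.

```python
def clamp_tooltip_position(x, y, tip_w, tip_h, screen_w, screen_h, margin=4):
    """Return (x, y) shifted so a tip_w x tip_h box stays on screen.

    Hypothetical helper: all names here are illustrative assumptions.
    """
    # Largest top-left coordinate that still keeps the whole box visible.
    max_x = max(margin, screen_w - tip_w - margin)
    max_y = max(margin, screen_h - tip_h - margin)
    # Clamp the requested position into [margin, max].
    clamped_x = min(max(x, margin), max_x)
    clamped_y = min(max(y, margin), max_y)
    return clamped_x, clamped_y
```

In a Tk application the screen size would typically come from `widget.winfo_screenwidth()` and `widget.winfo_screenheight()`, and the 500ms hover delay can be scheduled with `widget.after(500, callback)`.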
Kenneth Estanislao d0f81ed755 Merge pull request #1671 from laurigates/pr/fix-macos-camera-enum
fix(macos): replace cv2_enumerate_cameras with safe bounded loop
2026-02-24 14:29:00 +08:00
Kenneth Estanislao de01b28802 Merge pull request #1678 from laurigates/pr/perf-opacity-handling
perf(face-swapper): optimize opacity handling and frame copies
2026-02-24 14:28:17 +08:00
Lauri Gates b645d5e60b fix(macos): replace cv2_enumerate_cameras with safe bounded loop
cv2_enumerate_cameras(CAP_AVFOUNDATION) probes indices 0-99 through
OpenCV's AVFoundation backend, which intermittently segfaults (exit
code 139) when invalid device indices are probed. Replace with a
bounded cv2.VideoCapture loop (range(10)) that safely skips
unavailable indices.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 17:22:35 +02:00
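A bounded probing loop of the kind this commit describes might be sketched as follows. The helper name `find_cameras` and the `open_capture` parameter are assumptions made for illustration; the actual change in the repository may be structured differently.

```python
def find_cameras(open_capture, max_index=10):
    """Probe device indices 0..max_index-1 and return those that open.

    `open_capture` is any callable that takes an index and returns an
    object with isOpened() and release(), e.g. cv2.VideoCapture.
    (Hypothetical helper, not the project's own function.)
    """
    available = []
    for index in range(max_index):
        cap = open_capture(index)
        try:
            if cap.isOpened():
                available.append(index)
        finally:
            # Always release, even for indices that failed to open.
            cap.release()
    return available
```

With OpenCV installed this could be called as `find_cameras(cv2.VideoCapture)`. Passing the factory in keeps the loop testable without a camera attached.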
Kenneth Estanislao 31b3a97003 Merge pull request #1680 from laurigates/pr/perf-float32-buffer-reuse
perf(processing): optimize post-processing with float32 and buffer reuse
2026-02-23 15:13:03 +08:00
Kenneth Estanislao e3b46e83b7 Merge pull request #1669 from laurigates/pr/feat-gpen-enhancers
feat: add GPEN-BFR 256 and 512 ONNX face enhancers
2026-02-23 15:05:44 +08:00
Lauri Gates e93fb95903 perf(processing): optimize post-processing with float32 and buffer reuse
- Replace float64 with float32 in apply_mouth_area() blending masks —
  float32 provides sufficient precision for 8-bit image blending and
  halves memory bandwidth
- Use float32 in apply_mask_area() mask computations
- Vectorize hull padding loop in create_face_mask() (face_masking.py)
  replacing per-point Python loop with NumPy array operations
- Fix apply_color_transfer() to use proper [0,1] LAB conversion —
  cv2.cvtColor with float32 input expects [0,1] range, not [0,255]
- Pre-compute inverse masks to avoid repeated (1.0 - mask) subtraction
- Use np.broadcast_to instead of np.repeat for face mask expansion

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 21:27:31 +02:00
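The float32 blending, pre-computed inverse mask, and `np.broadcast_to` points from this commit can be illustrated with a small sketch. The function `blend_with_mask` is a hypothetical stand-in, not the project's `apply_mouth_area()` or `apply_mask_area()`, and it assumes a single-channel mask with values in [0, 1].

```python
import numpy as np

def blend_with_mask(foreground, background, mask):
    """Blend two HxWx3 uint8 images using an HxW mask in [0, 1].

    Illustrative only: names and signature are assumptions.
    """
    # broadcast_to returns a read-only view, so the mask is not
    # physically repeated three times the way np.repeat would do.
    mask3 = np.broadcast_to(
        mask.astype(np.float32)[:, :, np.newaxis], foreground.shape
    )
    # Compute the inverse once and reuse it.
    inv_mask3 = 1.0 - mask3
    out = (foreground.astype(np.float32) * mask3
           + background.astype(np.float32) * inv_mask3)
    return np.clip(out, 0, 255).astype(np.uint8)
```

float32 is enough here because the inputs are 8-bit: intermediate rounding error is far below one intensity level.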
Lauri Gates aabf41050a perf(face-swapper): optimize opacity handling and frame copies
Move opacity calculation before frame copy to skip the copy when
opacity is 1.0 (common case). Add early return path for full opacity.
Clear PREVIOUS_FRAME_RESULT instead of caching when interpolation
is disabled.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 21:12:02 +02:00
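The early-return idea in this commit (skip the copy and blend when opacity is 1.0) can be sketched like this. `apply_opacity` is a hypothetical function written for illustration; the project's face swapper may blend differently, for example via OpenCV.

```python
import numpy as np

def apply_opacity(original, swapped, opacity):
    """Mix `swapped` over `original`; both are uint8 arrays of equal shape.

    Illustrative sketch: the name and signature are assumptions.
    """
    opacity = float(min(max(opacity, 0.0), 1.0))
    if opacity >= 1.0:
        # Common case: nothing to blend, so no copy is made at all.
        return swapped
    if opacity <= 0.0:
        return original
    blended = (swapped.astype(np.float32) * opacity
               + original.astype(np.float32) * (1.0 - opacity))
    return np.clip(blended, 0, 255).astype(np.uint8)
```

Deciding on the opacity before touching the frame is what lets the full-opacity path avoid allocating anything.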
Lauri Gates e57116de68 feat: add GPEN-BFR 256 and 512 ONNX face enhancers
Add two new face enhancement processors using GPEN-BFR ONNX models
at 256x256 and 512x512 resolutions. Models auto-download on first
use from GitHub releases. Integrates into existing frame processor
pipeline alongside GFPGAN enhancer with UI toggle switches.

- modules/paths.py: Shared path constants module
- modules/processors/frame/_onnx_enhancer.py: ONNX enhancement utilities
- modules/processors/frame/face_enhancer_gpen256.py: GPEN-BFR 256 processor
- modules/processors/frame/face_enhancer_gpen512.py: GPEN-BFR 512 processor
- modules/core.py: Add GPEN choices to --frame-processor CLI arg
- modules/globals.py: Add GPEN entries to fp_ui toggle dict
- modules/ui.py: Add GPEN toggle switches and processing integration

Closes #1663

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 19:39:12 +02:00
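The diff's `_onnx_enhancer.py` docstring says GPEN-BFR expects a `[1, 3, H, W]` float32 blob normalized to [-1, 1]. A minimal round-trip sketch of that normalization, leaving out the resize and BGR/RGB conversion, might look as follows. The names `to_blob` and `from_blob` are hypothetical, and rounding to nearest before the uint8 cast is this sketch's own choice (the diff's `postprocess_face` casts directly after clipping).

```python
import numpy as np

def to_blob(rgb_uint8):
    """HxWx3 uint8 -> 1x3xHxW float32 in [-1, 1] (illustrative helper)."""
    blob = rgb_uint8.astype(np.float32) / 255.0 * 2.0 - 1.0
    return np.transpose(blob, (2, 0, 1))[np.newaxis, ...]

def from_blob(blob):
    """1x3xHxW float32 in [-1, 1] -> HxWx3 uint8 (illustrative helper)."""
    img = (blob[0].transpose(1, 2, 0) + 1.0) / 2.0 * 255.0
    # Round to nearest so that float error cannot drop a value by one.
    return np.clip(np.rint(img), 0, 255).astype(np.uint8)
```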
Kenneth Estanislao d5338a3eae Update version in README and add contributor 2026-02-23 01:02:22 +08:00
Kenneth Estanislao 7ec3a4be29 Merge pull request #1665 from laurigates/pr/perf-pipeline-threading
perf(ui): decouple face detection from swap in live webcam pipeline
2026-02-23 00:59:22 +08:00
Lauri Gates ca6cba9311 perf(ui): decouple face detection from swap in live webcam pipeline
Add a dedicated detection thread that runs face detection continuously
on the latest captured frame and publishes results to a shared dict.
The processing/swap thread reads cached detection results instead of
running detection inline, so it never blocks on the 15-30ms detection
cost.

Architecture change: 2 threads → 3 threads
  Before: capture → [detect + swap] → display
  After:  capture → swap (uses cached detections) → display
                  ↘ detect (async, writes to shared cache) ↗

Also replaces the blocking while/ROOT.update() display loop with
ROOT.after()-based scheduling, which avoids Tk event loop re-entrancy
issues and UI freezes.

Closes #1664
2026-02-22 18:41:47 +02:00
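The capture / detect / swap split described in this commit can be sketched with a tiny thread-safe holder. The commit message mentions a shared dict; the `LatestValue` class, the `detection_worker` function, and the `detect` callable below are hypothetical names used only to show the pattern.

```python
import threading
import time

class LatestValue:
    """Thread-safe holder for the most recent value (illustrative)."""
    def __init__(self):
        self._lock = threading.Lock()
        self._value = None

    def set(self, value):
        with self._lock:
            self._value = value

    def get(self):
        with self._lock:
            return self._value

def detection_worker(latest_frame, detections, detect, stop_event,
                     idle_sleep=0.001):
    """Keep running `detect` on the newest frame and publish the result."""
    while not stop_event.is_set():
        frame = latest_frame.get()
        if frame is not None:
            detections.set(detect(frame))
        # Short pause so the loop does not spin at 100% CPU.
        time.sleep(idle_sleep)
```

The swap thread would then call `detections.get()` and use whatever result is current, so it never waits for detection to finish.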
Kenneth Estanislao d89385457e Merge pull request #1659 from laurigates/pr/fix-tk9-compat
fix(ui): patch CTkOptionMenu for Tk 9.0 compatibility
2026-02-23 00:13:47 +08:00
Kenneth Estanislao b015f0099f Update GFPGANv1.4 download link to ONNX format 2026-02-23 00:03:37 +08:00
Lauri Gates a1722c7b2e fix(ui): patch CTkOptionMenu for Tk 9.0 compatibility
In Tk 9.0, Menu.index("end") returns "" instead of raising TclError
on empty menus. CustomTkinter's DropdownMenu._add_menu_commands
doesn't handle this case, causing a crash when creating CTkOptionMenu
widgets (e.g., the camera selector dropdown).

Add a monkey-patch that guards against the empty-string return value.
2026-02-22 11:59:51 +02:00
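Going only by the commit message above, the guard against an empty-string return from `Menu.index("end")` might be sketched as below. `last_menu_index` is a hypothetical helper, not the monkey-patch that was actually merged; it accepts any object with an `index` method so it can be exercised without Tk.

```python
def last_menu_index(menu):
    """Return the last item index of a menu as int, or None if empty.

    Hypothetical helper illustrating the guard; not CustomTkinter code.
    """
    try:
        end = menu.index("end")
    except Exception:
        # In real use this would be tkinter.TclError; caught broadly
        # here so the sketch runs without Tk installed.
        return None
    if end is None or end == "":
        # Empty menu: treat both None and "" as "no items".
        return None
    return int(end)
```

A caller would then start adding commands at index 0 whenever the helper returns `None`.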
18 changed files with 928 additions and 297 deletions
+1
@@ -26,3 +26,4 @@ faceswap/
.vscode/
switch_states.json
/models
install.bat
+18 -21
@@ -1,4 +1,4 @@
<h1 align="center">Deep-Live-Cam 2.0.4c</h1>
<h1 align="center">Deep-Live-Cam 2.1</h1>
<p align="center">
Real-time face swap and video deepfake with a single click and only a single image.
@@ -30,11 +30,11 @@ By using this software, you agree to these terms and commit to using it in a man
Users are expected to use this software responsibly and legally. If using a real person's face, obtain their consent and clearly label any output as a deepfake when sharing online. We are not responsible for end-user actions.
## Exclusive v2.6d Quick Start - Pre-built (Windows/Mac Silicon)
## Exclusive v2.7 beta Quick Start - Pre-built (Windows/Mac Silicon/CPU)
<a href="https://deeplivecam.net/index.php/quickstart"> <img src="media/Download.png" width="285" height="77" />
##### This is the fastest build you can get if you have a discrete NVIDIA or AMD GPU or Mac Silicon, And you'll receive special priority support.
##### This is the fastest build you can get if you have a discrete NVIDIA or AMD GPU, CPU or Mac Silicon, And you'll receive special priority support. 2.7 beta is the best you can have with 30+ extra features than the open source version.
###### These Pre-builts are perfect for non-technical users or those who don't have time to, or can't manually install all the requirements. Just a heads-up: this is an open-source project, so you can also install it manually.
@@ -124,7 +124,7 @@ cd Deep-Live-Cam
**3. Download the Models**
1. [GFPGANv1.4](https://huggingface.co/hacksider/deep-live-cam/resolve/main/GFPGANv1.4.pth)
1. [GFPGANv1.4](https://huggingface.co/hacksider/deep-live-cam/resolve/main/GFPGANv1.4.onnx)
2. [inswapper\_128\_fp16.onnx](https://huggingface.co/hacksider/deep-live-cam/resolve/main/inswapper_128_fp16.onnx)
Place these files in the "**models**" folder.
@@ -309,6 +309,9 @@ python run.py --execution-provider openvino
- Use a screen capture tool like OBS to stream.
- To change the face, select a new source image.
## Download all models in this huggingface link
- [**Download models here**](https://huggingface.co/hacksider/deep-live-cam/tree/main)
## Command Line Arguments (Unmaintained)
```
@@ -338,23 +341,16 @@ Looking for a CLI mode? Using the -s/--source argument will make the run program
## Press
**We are always open to criticism and are ready to improve, that's why we didn't cherry-pick anything.**
- [*"Deep-Live-Cam goes viral, allowing anyone to become a digital doppelganger"*](https://arstechnica.com/information-technology/2024/08/new-ai-tool-enables-real-time-face-swapping-on-webcams-raising-fraud-concerns/) - Ars Technica
- [*"Thanks Deep Live Cam, shapeshifters are among us now"*](https://dataconomy.com/2024/08/15/what-is-deep-live-cam-github-deepfake/) - Dataconomy
- [*"This free AI tool lets you become anyone during video-calls"*](https://www.newsbytesapp.com/news/science/deep-live-cam-ai-impersonation-tool-goes-viral/story) - NewsBytes
- [*"OK, this viral AI live stream software is truly terrifying"*](https://www.creativebloq.com/ai/ok-this-viral-ai-live-stream-software-is-truly-terrifying) - Creative Bloq
- [*"Deepfake AI Tool Lets You Become Anyone in a Video Call With Single Photo"*](https://petapixel.com/2024/08/14/deep-live-cam-deepfake-ai-tool-lets-you-become-anyone-in-a-video-call-with-single-photo-mark-zuckerberg-jd-vance-elon-musk/) - PetaPixel
- [*"Deep-Live-Cam Uses AI to Transform Your Face in Real-Time, Celebrities Included"*](https://www.techeblog.com/deep-live-cam-ai-transform-face/) - TechEBlog
- [*"An AI tool that "makes you look like anyone" during a video call is going viral online"*](https://telegrafi.com/en/a-tool-that-makes-you-look-like-anyone-during-a-video-call-is-going-viral-on-the-Internet/) - Telegrafi
- [*"This Deepfake Tool Turning Images Into Livestreams is Topping the GitHub Charts"*](https://decrypt.co/244565/this-deepfake-tool-turning-images-into-livestreams-is-topping-the-github-charts) - Emerge
- [*"New Real-Time Face-Swapping AI Allows Anyone to Mimic Famous Faces"*](https://www.digitalmusicnews.com/2024/08/15/face-swapping-ai-real-time-mimic/) - Digital Music News
- [*"This real-time webcam deepfake tool raises alarms about the future of identity theft"*](https://www.diyphotography.net/this-real-time-webcam-deepfake-tool-raises-alarms-about-the-future-of-identity-theft/) - DIYPhotography
- [*"That's Crazy, Oh God. That's Fucking Freaky Dude... That's So Wild Dude"*](https://www.youtube.com/watch?time_continue=1074&v=py4Tc-Y8BcY) - SomeOrdinaryGamers
- [*"Alright look look look, now look chat, we can do any face we want to look like chat"*](https://www.youtube.com/live/mFsCe7AIxq8?feature=shared&t=2686) - IShowSpeed
- [*"They do a pretty good job matching poses, expression and even the lighting"*](https://www.youtube.com/watch?v=wnCghLjqv3s&t=551s) - TechLinked (LTT)
- [*"Als Sean Connery an der Redaktionskonferenz teilnahm"*](https://www.golem.de/news/deepfakes-als-sean-connery-an-der-redaktionskonferenz-teilnahm-2408-188172.html) - Golem.de (German)
- [*"What the F***! Why do I look like Vinny Jr? I look exactly like Vinny Jr!? No, this shit is crazy! Bro This is F*** Crazy! "*](https://youtu.be/JbUPRmXRUtE?t=3964) - IShowSpeed
- [**Ars Technica**](https://arstechnica.com/information-technology/2024/08/new-ai-tool-enables-real-time-face-swapping-on-webcams-raising-fraud-concerns/) - *"Deep-Live-Cam goes viral, allowing anyone to become a digital doppelganger"*
- [**Yahoo!**](https://www.yahoo.com/tech/ok-viral-ai-live-stream-080041056.html) - *"OK, this viral AI live stream software is truly terrifying"*
- [**CNN Brasil**](https://www.cnnbrasil.com.br/tecnologia/ia-consegue-clonar-rostos-na-webcam-entenda-funcionamento/) - *"AI can clone faces on webcam; understand how it works"*
- [**Bloomberg Technoz**](https://www.bloombergtechnoz.com/detail-news/71032/kenalan-dengan-teknologi-deep-live-cam-bisa-jadi-alat-menipu) - *"Get to know Deep Live Cam technology, it can be used as a tool for deception."*
- [**TrendMicro**](https://www.trendmicro.com/vinfo/gb/security/news/cyber-attacks/ai-vs-ai-deepfakes-and-ekyc) - *"AI vs AI: DeepFakes and eKYC"*
- [**PetaPixel**](https://petapixel.com/2024/08/14/deep-live-cam-deepfake-ai-tool-lets-you-become-anyone-in-a-video-call-with-single-photo-mark-zuckerberg-jd-vance-elon-musk/) - *"Deepfake AI Tool Lets You Become Anyone in a Video Call With Single Photo"*
- [**SomeOrdinaryGamers**](https://www.youtube.com/watch?time_continue=1074&v=py4Tc-Y8BcY) - *"That's Crazy, Oh God. That's Fucking Freaky Dude... That's So Wild Dude"*
- [**IShowSpeed**](https://www.youtube.com/live/mFsCe7AIxq8?feature=shared&t=2686) - *"Alright look look look, now look chat, we can do any face we want to look like chat"*
- [**TechLinked (Linus Tech Tips)**](https://www.youtube.com/watch?v=wnCghLjqv3s&t=551s) - *"They do a pretty good job matching poses, expression and even the lighting"*
- [**IShowSpeed**](https://youtu.be/JbUPRmXRUtE?t=3964) - *"What the F***! Why do I look like Vinny Jr? I look exactly like Vinny Jr!? No, this shit is crazy! Bro This is F*** Crazy!"*
## Credits
@@ -368,6 +364,7 @@ Looking for a CLI mode? Using the -s/--source argument will make the run program
- [vic4key](https://github.com/vic4key): For supporting/contributing to this project
- [kier007](https://github.com/kier007): for improving the user experience
- [qitianai](https://github.com/qitianai): for multi-lingual support
- [laurigates](https://github.com/laurigates): Decoupling stuffs to make everything faster!
- and [all developers](https://github.com/hacksider/Deep-Live-Cam/graphs/contributors) behind libraries used in this project.
- Footnote: Please be informed that the base author of the code is [s0md3v](https://github.com/s0md3v/roop)
- All the wonderful users who helped make this project go viral by starring the repo ❤️
+4 -6
@@ -39,7 +39,7 @@ def parse_args() -> None:
program.add_argument('-s', '--source', help='select an source image', dest='source_path')
program.add_argument('-t', '--target', help='select an target image or video', dest='target_path')
program.add_argument('-o', '--output', help='select output file or directory', dest='output_path')
program.add_argument('--frame-processor', help='pipeline of frame processors', dest='frame_processor', default=['face_swapper'], choices=['face_swapper', 'face_enhancer'], nargs='+')
program.add_argument('--frame-processor', help='pipeline of frame processors', dest='frame_processor', default=['face_swapper'], choices=['face_swapper', 'face_enhancer', 'face_enhancer_gpen256', 'face_enhancer_gpen512'], nargs='+')
program.add_argument('--keep-fps', help='keep original fps', dest='keep_fps', action='store_true', default=False)
program.add_argument('--keep-audio', help='keep original audio', dest='keep_audio', action='store_true', default=True)
program.add_argument('--keep-frames', help='keep temporary frames', dest='keep_frames', action='store_true', default=False)
@@ -86,11 +86,9 @@ def parse_args() -> None:
modules.globals.execution_threads = args.execution_threads
modules.globals.lang = args.lang
#for ENHANCER tumbler:
if 'face_enhancer' in args.frame_processor:
modules.globals.fp_ui['face_enhancer'] = True
else:
modules.globals.fp_ui['face_enhancer'] = False
#for ENHANCER tumblers:
for enhancer_key in ('face_enhancer', 'face_enhancer_gpen256', 'face_enhancer_gpen512'):
modules.globals.fp_ui[enhancer_key] = enhancer_key in args.frame_processor
# translate deprecated args
if args.source_path_deprecated:
+2 -2
@@ -28,9 +28,9 @@ def get_face_analyser() -> Any:
FACE_ANALYSER = insightface.app.FaceAnalysis(
name='buffalo_l',
providers=modules.globals.execution_providers,
allowed_modules=['detection', 'recognition']
allowed_modules=['detection', 'recognition', 'landmark_2d_106']
)
FACE_ANALYSER.prepare(ctx_id=0, det_size=(320, 320))
FACE_ANALYSER.prepare(ctx_id=0, det_size=(640, 640))
return FACE_ANALYSER
+2 -1
@@ -50,7 +50,7 @@ headless: bool | None = None # Run without UI?
log_level: str = "error" # Logging level (e.g., 'debug', 'info', 'warning', 'error')
# Face Processor UI Toggles (Example)
fp_ui: Dict[str, bool] = {"face_enhancer": False}
fp_ui: Dict[str, bool] = {"face_enhancer": False, "face_enhancer_gpen256": False, "face_enhancer_gpen512": False}
# Face Swapper Specific Options
face_swapper_enabled: bool = True # General toggle for the swapper processor
@@ -63,6 +63,7 @@ show_mouth_mask_box: bool = False # Visualize the mouth mask area (for debuggin
mask_feather_ratio: int = 12 # Denominator for feathering calculation (higher = smaller feather)
mask_down_size: float = 0.1 # Expansion factor for lower lip mask (relative)
mask_size: float = 1.0 # Expansion factor for upper lip mask (relative)
mouth_mask_size: float = 0.0 # Mouth mask size (0-100; 0=off, 100=mouth to chin)
# --- START: Added for Frame Interpolation ---
enable_interpolation: bool = True # Toggle temporal smoothing
+1 -1
@@ -1,3 +1,3 @@
name = 'Deep-Live-Cam'
version = '2.0.3c'
version = '2.1'
edition = 'GitHub Edition'
+6
@@ -0,0 +1,6 @@
"""Shared path constants for the Deep-Live-Cam project."""
import os
ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
MODELS_DIR = os.path.join(ROOT_DIR, "models")
+145
@@ -0,0 +1,145 @@
"""Shared ONNX-based face enhancement utilities for GPEN-BFR models.
Provides session creation, pre/post processing, and the core
enhance-face-via-ONNX pipeline.
"""
import os
import platform
import threading
from typing import Any
import cv2
import numpy as np
import onnxruntime
import modules.globals
IS_APPLE_SILICON = platform.system() == "Darwin" and platform.machine() == "arm64"
# Limit concurrent ONNX calls to avoid VRAM exhaustion on multi-face frames
THREAD_SEMAPHORE = threading.Semaphore(min(max(1, (os.cpu_count() or 1)), 8))
def create_onnx_session(model_path: str) -> onnxruntime.InferenceSession:
"""Create an ONNX Runtime session using the configured execution providers."""
providers = modules.globals.execution_providers
session = onnxruntime.InferenceSession(model_path, providers=providers)
return session
def warmup_session(session: onnxruntime.InferenceSession) -> None:
"""Run a dummy inference pass to trigger JIT / compile caching."""
try:
input_feed = {
inp.name: np.zeros(
[d if isinstance(d, int) and d > 0 else 1 for d in inp.shape],
dtype=np.float32,
)
for inp in session.get_inputs()
}
session.run(None, input_feed)
except Exception as e:
print(f"ONNX enhancer warmup skipped (non-fatal): {e}")
def preprocess_face(face_img: np.ndarray, input_size: int) -> np.ndarray:
"""Resize, normalize, and convert a BGR face crop to ONNX input blob.
GPEN-BFR expects [1, 3, H, W] float32 in RGB, normalized to [-1, 1].
"""
resized = cv2.resize(face_img, (input_size, input_size), interpolation=cv2.INTER_LINEAR)
rgb = cv2.cvtColor(resized, cv2.COLOR_BGR2RGB)
blob = rgb.astype(np.float32) / 255.0 * 2.0 - 1.0
blob = np.transpose(blob, (2, 0, 1))[np.newaxis, ...]
return blob
def postprocess_face(output: np.ndarray) -> np.ndarray:
"""Convert ONNX output [1, 3, H, W] float32 back to BGR uint8 image."""
img = output[0].transpose(1, 2, 0)
img = ((img + 1.0) / 2.0 * 255.0)
img = np.clip(img, 0, 255).astype(np.uint8)
img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
return img
def _get_face_affine(face: Any, input_size: int):
"""Compute affine transform to align a face to GPEN input space.
Returns (M, inv_M) — forward and inverse affine matrices.
"""
template = np.array([
[0.31556875, 0.4615741],
[0.68262291, 0.4615741],
[0.50009375, 0.6405054],
[0.34947187, 0.8246919],
[0.65343645, 0.8246919],
], dtype=np.float32) * input_size
landmarks = None
if hasattr(face, "kps") and face.kps is not None:
landmarks = face.kps.astype(np.float32)
elif hasattr(face, "landmark_2d_106") and face.landmark_2d_106 is not None:
lm106 = face.landmark_2d_106
landmarks = np.array([
lm106[38], # left eye
lm106[88], # right eye
lm106[86], # nose tip
lm106[52], # left mouth
lm106[61], # right mouth
], dtype=np.float32)
if landmarks is None or len(landmarks) < 5:
return None, None
M = cv2.estimateAffinePartial2D(landmarks, template, method=cv2.LMEDS)[0]
if M is None:
return None, None
inv_M = cv2.invertAffineTransform(M)
return M, inv_M
def enhance_face_onnx(
frame: np.ndarray,
face: Any,
session: onnxruntime.InferenceSession,
input_size: int,
) -> np.ndarray:
"""Enhance a single face in the frame using an ONNX face restoration model."""
M, inv_M = _get_face_affine(face, input_size)
if M is None:
return frame
face_crop = cv2.warpAffine(
frame, M, (input_size, input_size),
flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_REPLICATE,
)
blob = preprocess_face(face_crop, input_size)
with THREAD_SEMAPHORE:
output = session.run(None, {session.get_inputs()[0].name: blob})[0]
enhanced = postprocess_face(output)
# Create mask for blending (feathered edges)
mask = np.ones((input_size, input_size), dtype=np.float32)
border = max(1, input_size // 16)
mask[:border, :] = np.linspace(0, 1, border)[:, np.newaxis]
mask[-border:, :] = np.linspace(1, 0, border)[:, np.newaxis]
mask[:, :border] = np.minimum(mask[:, :border], np.linspace(0, 1, border)[np.newaxis, :])
mask[:, -border:] = np.minimum(mask[:, -border:], np.linspace(1, 0, border)[np.newaxis, :])
h, w = frame.shape[:2]
warped_enhanced = cv2.warpAffine(
enhanced, inv_M, (w, h),
flags=cv2.INTER_LINEAR, borderValue=(0, 0, 0),
)
warped_mask = cv2.warpAffine(
mask, inv_M, (w, h),
flags=cv2.INTER_LINEAR, borderValue=0,
)
mask_3ch = warped_mask[:, :, np.newaxis]
result = (warped_enhanced.astype(np.float32) * mask_3ch +
frame.astype(np.float32) * (1.0 - mask_3ch))
return np.clip(result, 0, 255).astype(np.uint8)
+9
@@ -17,8 +17,17 @@ FRAME_PROCESSORS_INTERFACE = [
'process_video'
]
ALLOWED_PROCESSORS = {
'face_swapper',
'face_enhancer',
'face_enhancer_gpen256',
'face_enhancer_gpen512'
}
def load_frame_processor_module(frame_processor: str) -> Any:
if frame_processor not in ALLOWED_PROCESSORS:
print(f"Frame processor {frame_processor} is not allowed")
sys.exit()
try:
frame_processor_module = importlib.import_module(f'modules.processors.frame.{frame_processor}')
for method_name in FRAME_PROCESSORS_INTERFACE:
@@ -0,0 +1,125 @@
"""GPEN-BFR-256 face enhancer — ONNX-based face restoration at 256x256."""
from typing import Any, List
import os
import threading
import cv2
import numpy as np
import modules.globals
import modules.processors.frame.core
from modules.core import update_status
from modules.face_analyser import get_one_face
from modules.typing import Frame, Face
from modules.utilities import (
is_image,
is_video,
)
from modules.processors.frame._onnx_enhancer import (
create_onnx_session,
warmup_session,
enhance_face_onnx,
)
NAME = "DLC.FACE-ENHANCER-GPEN256"
INPUT_SIZE = 256
MODEL_URL = "https://github.com/harisreedhar/Face-Upscalers-ONNX/releases/download/GPEN-BFR/GPEN-BFR-256.onnx"
MODEL_FILE = "GPEN-BFR-256.onnx"
ENHANCER = None
THREAD_LOCK = threading.Lock()
abs_dir = os.path.dirname(os.path.abspath(__file__))
models_dir = os.path.join(
os.path.dirname(os.path.dirname(os.path.dirname(abs_dir))), "models"
)
def pre_check() -> bool:
model_path = os.path.join(models_dir, MODEL_FILE)
if not os.path.exists(model_path):
update_status(f"Downloading {MODEL_FILE}...", NAME)
from modules.utilities import conditional_download
conditional_download(models_dir, [MODEL_URL])
return True
def pre_start() -> bool:
if not is_image(modules.globals.target_path) and not is_video(modules.globals.target_path):
update_status("Select an image or video for target path.", NAME)
return False
return True
def get_enhancer() -> Any:
global ENHANCER
with THREAD_LOCK:
if ENHANCER is None:
model_path = os.path.join(models_dir, MODEL_FILE)
if not os.path.exists(model_path):
from modules.utilities import conditional_download
conditional_download(models_dir, [MODEL_URL])
if not os.path.exists(model_path):
raise FileNotFoundError(f"Model file not found: {model_path}")
print(f"{NAME}: Loading ONNX model from {model_path}")
ENHANCER = create_onnx_session(model_path)
warmup_session(ENHANCER)
print(f"{NAME}: Model loaded successfully.")
return ENHANCER
def enhance_face(temp_frame: Frame, face: Face) -> Frame:
try:
session = get_enhancer()
except Exception as e:
print(f"{NAME}: {e}")
return temp_frame
try:
return enhance_face_onnx(temp_frame, face, session, INPUT_SIZE)
except Exception as e:
print(f"{NAME}: Error during face enhancement: {e}")
return temp_frame
def process_frame(source_face: Face | None, temp_frame: Frame) -> Frame:
target_face = get_one_face(temp_frame)
if target_face is None:
return temp_frame
return enhance_face(temp_frame, target_face)
def process_frame_v2(temp_frame: Frame) -> Frame:
target_face = get_one_face(temp_frame)
if target_face:
temp_frame = enhance_face(temp_frame, target_face)
return temp_frame
def process_frames(
source_path: str | None, temp_frame_paths: List[str], progress: Any = None
) -> None:
for temp_frame_path in temp_frame_paths:
temp_frame = cv2.imread(temp_frame_path)
if temp_frame is None:
if progress:
progress.update(1)
continue
result = process_frame(None, temp_frame)
cv2.imwrite(temp_frame_path, result)
if progress:
progress.update(1)
def process_image(source_path: str | None, target_path: str, output_path: str) -> None:
target_frame = cv2.imread(target_path)
if target_frame is None:
print(f"{NAME}: Error: Failed to read target image {target_path}")
return
result_frame = process_frame(None, target_frame)
cv2.imwrite(output_path, result_frame)
print(f"{NAME}: Enhanced image saved to {output_path}")
def process_video(source_path: str | None, temp_frame_paths: List[str]) -> None:
modules.processors.frame.core.process_video(source_path, temp_frame_paths, process_frames)
@@ -0,0 +1,125 @@
"""GPEN-BFR-512 face enhancer — ONNX-based face restoration at 512x512."""
from typing import Any, List
import os
import threading
import cv2
import numpy as np
import modules.globals
import modules.processors.frame.core
from modules.core import update_status
from modules.face_analyser import get_one_face
from modules.typing import Frame, Face
from modules.utilities import (
is_image,
is_video,
)
from modules.processors.frame._onnx_enhancer import (
create_onnx_session,
warmup_session,
enhance_face_onnx,
)
NAME = "DLC.FACE-ENHANCER-GPEN512"
INPUT_SIZE = 512
MODEL_URL = "https://github.com/harisreedhar/Face-Upscalers-ONNX/releases/download/GPEN-BFR/GPEN-BFR-512.onnx"
MODEL_FILE = "GPEN-BFR-512.onnx"
ENHANCER = None
THREAD_LOCK = threading.Lock()
abs_dir = os.path.dirname(os.path.abspath(__file__))
models_dir = os.path.join(
os.path.dirname(os.path.dirname(os.path.dirname(abs_dir))), "models"
)
def pre_check() -> bool:
model_path = os.path.join(models_dir, MODEL_FILE)
if not os.path.exists(model_path):
update_status(f"Downloading {MODEL_FILE}...", NAME)
from modules.utilities import conditional_download
conditional_download(models_dir, [MODEL_URL])
return True
def pre_start() -> bool:
if not is_image(modules.globals.target_path) and not is_video(modules.globals.target_path):
update_status("Select an image or video for target path.", NAME)
return False
return True
def get_enhancer() -> Any:
global ENHANCER
with THREAD_LOCK:
if ENHANCER is None:
model_path = os.path.join(models_dir, MODEL_FILE)
if not os.path.exists(model_path):
from modules.utilities import conditional_download
conditional_download(models_dir, [MODEL_URL])
if not os.path.exists(model_path):
raise FileNotFoundError(f"Model file not found: {model_path}")
print(f"{NAME}: Loading ONNX model from {model_path}")
ENHANCER = create_onnx_session(model_path)
warmup_session(ENHANCER)
print(f"{NAME}: Model loaded successfully.")
return ENHANCER
def enhance_face(temp_frame: Frame, face: Face) -> Frame:
try:
session = get_enhancer()
except Exception as e:
print(f"{NAME}: {e}")
return temp_frame
try:
return enhance_face_onnx(temp_frame, face, session, INPUT_SIZE)
except Exception as e:
print(f"{NAME}: Error during face enhancement: {e}")
return temp_frame
def process_frame(source_face: Face | None, temp_frame: Frame) -> Frame:
target_face = get_one_face(temp_frame)
if target_face is None:
return temp_frame
return enhance_face(temp_frame, target_face)
def process_frame_v2(temp_frame: Frame) -> Frame:
target_face = get_one_face(temp_frame)
if target_face:
temp_frame = enhance_face(temp_frame, target_face)
return temp_frame
def process_frames(
source_path: str | None, temp_frame_paths: List[str], progress: Any = None
) -> None:
for temp_frame_path in temp_frame_paths:
temp_frame = cv2.imread(temp_frame_path)
if temp_frame is None:
if progress:
progress.update(1)
continue
result = process_frame(None, temp_frame)
cv2.imwrite(temp_frame_path, result)
if progress:
progress.update(1)
def process_image(source_path: str | None, target_path: str, output_path: str) -> None:
target_frame = cv2.imread(target_path)
if target_frame is None:
print(f"{NAME}: Error: Failed to read target image {target_path}")
return
result_frame = process_frame(None, target_frame)
cv2.imwrite(output_path, result_frame)
print(f"{NAME}: Enhanced image saved to {output_path}")
def process_video(source_path: str | None, temp_frame_paths: List[str]) -> None:
modules.processors.frame.core.process_video(source_path, temp_frame_paths, process_frames)
+50 -40
@@ -6,24 +6,31 @@ from modules.gpu_processing import gpu_gaussian_blur, gpu_resize, gpu_cvt_color
def apply_color_transfer(source, target):
"""
Apply color transfer from target to source image
Apply color transfer from target to source image using LAB color space.
Uses float32 throughout for performance (sufficient precision for 8-bit images).
"""
source = cv2.cvtColor(source, cv2.COLOR_BGR2LAB).astype("float32")
target = cv2.cvtColor(target, cv2.COLOR_BGR2LAB).astype("float32")
# Convert to float32 [0,1] range for proper LAB conversion
source_f32 = source.astype(np.float32) / 255.0
target_f32 = target.astype(np.float32) / 255.0
source_mean, source_std = cv2.meanStdDev(source)
target_mean, target_std = cv2.meanStdDev(target)
source_lab = cv2.cvtColor(source_f32, cv2.COLOR_BGR2LAB)
target_lab = cv2.cvtColor(target_f32, cv2.COLOR_BGR2LAB)
# Reshape mean and std to be broadcastable
source_mean = source_mean.reshape(1, 1, 3)
source_std = source_std.reshape(1, 1, 3)
target_mean = target_mean.reshape(1, 1, 3)
target_std = target_std.reshape(1, 1, 3)
source_mean, source_std = cv2.meanStdDev(source_lab)
target_mean, target_std = cv2.meanStdDev(target_lab)
# Perform the color transfer
source = (source - source_mean) * (target_std / source_std) + target_mean
# Reshape mean and std to be broadcastable (already float64 from meanStdDev, cast to f32)
source_mean = source_mean.reshape(1, 1, 3).astype(np.float32)
source_std = np.maximum(source_std.reshape(1, 1, 3), 1e-6).astype(np.float32)
target_mean = target_mean.reshape(1, 1, 3).astype(np.float32)
target_std = target_std.reshape(1, 1, 3).astype(np.float32)
return cv2.cvtColor(np.clip(source, 0, 255).astype("uint8"), cv2.COLOR_LAB2BGR)
# Perform the color transfer in LAB space
result_lab = (source_lab - source_mean) * (target_std / source_std) + target_mean
# Convert back to BGR and uint8
result_bgr = cv2.cvtColor(result_lab, cv2.COLOR_LAB2BGR)
return np.clip(result_bgr * 255.0, 0, 255).astype(np.uint8)
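Review note: the core of this hunk is a per-channel mean and standard deviation match, with a floor on the source deviation so a flat channel cannot divide by zero. A minimal pure-Python sketch of that arithmetic on a single channel follows. The helper name `transfer_channel` is hypothetical, and it assumes the population standard deviation, which is what `cv2.meanStdDev` is understood to report.

```python
from statistics import fmean, pstdev

def transfer_channel(source, target, eps=1e-6):
    """Match the mean and standard deviation of `source` to those of `target`."""
    src_mean, src_std = fmean(source), pstdev(source)
    tgt_mean, tgt_std = fmean(target), pstdev(target)
    # Floor the source deviation, mirroring np.maximum(source_std, 1e-6) in the hunk
    scale = tgt_std / max(src_std, eps)
    return [(v - src_mean) * scale + tgt_mean for v in source]

# A flat source channel collapses to the target mean instead of raising an error
flat = transfer_channel([10.0, 10.0, 10.0], [0.0, 50.0, 100.0])
# A channel with spread is shifted and stretched to the target statistics
shifted = transfer_channel([0.0, 10.0, 20.0], [100.0, 120.0, 140.0])
```

The real function applies the same formula to all three LAB channels at once through NumPy broadcasting.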
def create_face_mask(face: Face, frame: Frame) -> np.ndarray:
mask = np.zeros(frame.shape[:2], dtype=np.uint8)
@@ -48,16 +55,14 @@ def create_face_mask(face: Face, frame: Frame) -> np.ndarray:
# Create a slightly larger convex hull for padding
face_outline = landmarks[0:33]
hull = cv2.convexHull(face_outline)
hull_padded = []
for point in hull:
x, y = point[0]
center = np.mean(face_outline, axis=0)
direction = np.array([x, y]) - center
direction = direction / np.linalg.norm(direction)
padded_point = np.array([x, y]) + direction * padding
hull_padded.append(padded_point)
hull_padded = np.array(hull_padded, dtype=np.int32)
# Vectorized hull padding — expand each point outward from center
center = np.mean(face_outline, axis=0, dtype=np.float32)
hull_pts = hull.reshape(-1, 2).astype(np.float32)
directions = hull_pts - center
norms = np.linalg.norm(directions, axis=1, keepdims=True)
norms = np.maximum(norms, 1e-6) # avoid division by zero
directions /= norms
hull_padded = (hull_pts + directions * padding).astype(np.int32)
# Fill the padded convex hull
cv2.fillConvexPoly(mask, hull_padded, 255)
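Review note: the vectorized version computes the same geometry as the removed loop: each hull point moves `padding` pixels along the unit vector pointing away from the center. The sketch below restates that geometry point by point in plain Python so the edge case is easy to check. `pad_points` is a hypothetical name, and `int()` is used here because it truncates toward zero like the `astype(np.int32)` cast.

```python
import math

def pad_points(points, center, padding, eps=1e-6):
    """Push each (x, y) point `padding` pixels away from `center`."""
    padded = []
    for x, y in points:
        dx, dy = x - center[0], y - center[1]
        # Floor the length so a point sitting exactly on the center stays put
        norm = max(math.hypot(dx, dy), eps)
        padded.append((int(x + dx / norm * padding), int(y + dy / norm * padding)))
    return padded

square = [(0, 0), (10, 0), (10, 10), (0, 10)]
padded_square = pad_points(square, center=(5, 5), padding=5)
on_center = pad_points([(5, 5)], center=(5, 5), padding=5)
```

Without the floor on the norm, a hull point that coincides with the center would produce NaN coordinates, which is the case the new `np.maximum(norms, 1e-6)` line covers.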
@@ -77,8 +82,8 @@ def create_lower_mouth_mask(
landmarks = face.landmark_2d_106
if landmarks is not None:
# Use outer mouth landmarks (52-63) to capture the lips only
lower_lip_order = list(range(52, 64))
# Use outer mouth landmarks (52-71) to capture the full mouth area
lower_lip_order = list(range(52, 72))
if max(lower_lip_order) >= landmarks.shape[0]:
return mask, mouth_cutout, mouth_box, lower_lip_polygon
@@ -89,13 +94,16 @@ def create_lower_mouth_mask(
center = np.mean(lower_lip_landmarks, axis=0)
# Expand the landmarks outward using the mouth_mask_size
# Use a more conservative expansion to avoid affecting face shape
expansion_factor = (
1 + modules.globals.mask_down_size * modules.globals.mouth_mask_size
)
expanded_landmarks = (lower_lip_landmarks - center) * expansion_factor + center
mouth_mask_size = getattr(modules.globals, "mouth_mask_size", 0.0) # 0-100 slider
expansion_factor = 1 + (mouth_mask_size / 100.0) * 2.5
# Removed specific top/chin extensions to preserve face shape
# Expand with extra downward bias toward chin
offsets = lower_lip_landmarks - center
chin_bias = 1 + (mouth_mask_size / 100.0) * 1.5
scale_y = np.where(offsets[:, 1] > 0, expansion_factor * chin_bias, expansion_factor)
expanded_landmarks = lower_lip_landmarks.copy()
expanded_landmarks[:, 0] = center[0] + offsets[:, 0] * expansion_factor
expanded_landmarks[:, 1] = center[1] + offsets[:, 1] * scale_y
# Convert back to integer coordinates
expanded_landmarks = expanded_landmarks.astype(np.int32)
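Review note: the new expansion maps the 0-100 slider to a uniform scale of up to 3.5x, and points below the center get a further vertical factor of up to 2.5x. The following plain-Python sketch reproduces that arithmetic on a few made-up points; `expand_landmarks` and the sample coordinates are illustrative only.

```python
def expand_landmarks(landmarks, center, mouth_mask_size):
    """Scale (x, y) landmarks away from `center`; points below it stretch further."""
    t = mouth_mask_size / 100.0   # slider value mapped to 0..1
    expansion = 1 + t * 2.5       # uniform scale applied in x and y
    chin_bias = 1 + t * 1.5       # extra vertical factor for points below center
    out = []
    for x, y in landmarks:
        dx, dy = x - center[0], y - center[1]
        # Image y grows downward, so dy > 0 means "toward the chin"
        scale_y = expansion * chin_bias if dy > 0 else expansion
        out.append((center[0] + dx * expansion, center[1] + dy * scale_y))
    return out

lips = [(90.0, 100.0), (110.0, 100.0), (100.0, 96.0), (100.0, 104.0)]
unchanged = expand_landmarks(lips, (100.0, 100.0), 0)
widest = expand_landmarks(lips, (100.0, 100.0), 100)
```

At the maximum setting a point 4 px below the center ends up 35 px below it (4 x 3.5 x 2.5), while a point 4 px above moves only 14 px up.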
@@ -468,26 +476,28 @@ def apply_mask_area(
box_height // modules.globals.mask_feather_ratio,
)
feathered_mask = cv2.GaussianBlur(
polygon_mask.astype(float), (0, 0), feather_amount
polygon_mask.astype(np.float32), (0, 0), feather_amount
)
feathered_mask = feathered_mask / feathered_mask.max()
max_val = feathered_mask.max()
if max_val > 1e-6:
feathered_mask *= np.float32(1.0 / max_val)
# Apply additional smoothing to the mask edges
feathered_mask = cv2.GaussianBlur(feathered_mask, (5, 5), 1)
face_mask_roi = face_mask[min_y:max_y, min_x:max_x]
combined_mask = feathered_mask * (face_mask_roi / 255.0)
combined_mask = feathered_mask * (face_mask_roi.astype(np.float32) * np.float32(1.0 / 255.0))
combined_mask = combined_mask[:, :, np.newaxis]
combined_mask_3ch = combined_mask[:, :, np.newaxis]
inv_mask = np.float32(1.0) - combined_mask_3ch
blended = (
color_corrected_area * combined_mask + roi * (1 - combined_mask)
color_corrected_area * combined_mask_3ch + roi * inv_mask
).astype(np.uint8)
# Apply face mask to blended result
face_mask_3channel = (
np.repeat(face_mask_roi[:, :, np.newaxis], 3, axis=2) / 255.0
)
final_blend = blended * face_mask_3channel + roi * (1 - face_mask_3channel)
face_mask_f32 = face_mask_roi[:, :, np.newaxis].astype(np.float32) * np.float32(1.0 / 255.0)
face_mask_3channel = np.broadcast_to(face_mask_f32, blended.shape)
final_blend = blended * face_mask_3channel + roi * (np.float32(1.0) - face_mask_3channel)
frame[min_y:max_y, min_x:max_x] = final_blend.astype(np.uint8)
except Exception as e:
+56 -102
@@ -136,8 +136,12 @@ def swap_face(source_face: Face, target_face: Face, temp_frame: Frame) -> Frame:
if not hasattr(source_face, 'normed_embedding') or source_face.normed_embedding is None:
return temp_frame
# Store a copy of the original frame before swapping for opacity blending
original_frame = temp_frame.copy()
# Store a copy of the original frame before swapping for opacity blending and mouth mask
opacity = getattr(modules.globals, "opacity", 1.0)
opacity = max(0.0, min(1.0, opacity))
mouth_mask_enabled = getattr(modules.globals, "mouth_mask", False)
# Always copy if mouth mask is enabled (we need the unmodified original for mouth cutout)
original_frame = temp_frame.copy() if (opacity < 1.0 or mouth_mask_enabled) else temp_frame
# Pre-swap Input Check with optimization
if temp_frame.dtype != np.uint8:
@@ -188,28 +192,28 @@ def swap_face(source_face: Face, target_face: Face, temp_frame: Frame) -> Frame:
# --- Post-swap Processing (Masking, Opacity, etc.) ---
# Now, work with the guaranteed uint8 'swapped_frame'
if getattr(modules.globals, "mouth_mask", False): # Check if mouth_mask is enabled
if mouth_mask_enabled: # Check if mouth_mask is enabled
# Create a mask for the target face
face_mask = create_face_mask(target_face, temp_frame) # Use temp_frame (original shape) for mask creation geometry
face_mask = create_face_mask(target_face, original_frame) # Use original_frame for mask creation geometry
# Create the mouth mask using original geometry
# Create the mouth mask using the ORIGINAL frame (before swap) for cutout
mouth_mask, mouth_cutout, mouth_box, lower_lip_polygon = (
create_lower_mouth_mask(target_face, temp_frame) # Use temp_frame (original) for cutout
create_lower_mouth_mask(target_face, original_frame) # Use original_frame for real mouth cutout
)
# Apply the mouth area only if mouth_cutout exists
if mouth_cutout is not None and mouth_box != (0,0,0,0): # Add check for valid box
# Apply mouth area (from original) onto the 'swapped_frame'
if mouth_cutout is not None and mouth_box != (0,0,0,0):
# Apply mouth area (from original) onto the 'swapped_frame'
swapped_frame = apply_mouth_area(
swapped_frame, mouth_cutout, mouth_box, face_mask, lower_lip_polygon
)
# Draw bounding box only while slider is being dragged
if getattr(modules.globals, "show_mouth_mask_box", False):
mouth_mask_data = (mouth_mask, mouth_cutout, mouth_box, lower_lip_polygon)
# Draw visualization on the swapped_frame *before* opacity blending
swapped_frame = draw_mouth_mask_visualization(
swapped_frame, target_face, mouth_mask_data
)
mouth_mask_data = (mouth_mask, mouth_cutout, mouth_box, lower_lip_polygon)
swapped_frame = draw_mouth_mask_visualization(
swapped_frame, target_face, mouth_mask_data
)
# --- Poisson Blending ---
if getattr(modules.globals, "poisson_blend", False):
@@ -240,19 +244,13 @@ def swap_face(source_face: Face, target_face: Face, temp_frame: Frame) -> Frame:
except Exception as e:
print(f"Poisson blending failed: {e}")
# Apply opacity blend between the original frame and the swapped frame
opacity = getattr(modules.globals, "opacity", 1.0)
# Ensure opacity is within valid range [0.0, 1.0]
opacity = max(0.0, min(1.0, opacity))
# Apply opacity blend between the original frame and the swapped frame
if opacity >= 1.0:
return swapped_frame.astype(np.uint8)
# Blend the original_frame with the (potentially mouth-masked) swapped_frame
# Ensure both frames are uint8 before blending
final_swapped_frame = gpu_add_weighted(original_frame.astype(np.uint8), 1 - opacity, swapped_frame.astype(np.uint8), opacity, 0)
# Ensure final frame is uint8 after blending (addWeighted should preserve it, but belt-and-suspenders)
final_swapped_frame = final_swapped_frame.astype(np.uint8)
return final_swapped_frame
return final_swapped_frame.astype(np.uint8)
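Review note: after this change the opacity is clamped once near the top of `swap_face`, and the blend is skipped entirely when it reaches 1.0. The sketch below shows the same control flow for a single 8-bit channel value. `blend_pixel` is a hypothetical helper, and it assumes the project's `gpu_add_weighted` behaves like a weighted sum with rounding and saturation.

```python
def blend_pixel(original, swapped, opacity):
    """Blend two 8-bit channel values; `opacity` is the weight of the swapped frame."""
    opacity = max(0.0, min(1.0, opacity))  # clamp to [0, 1]
    if opacity >= 1.0:
        return swapped                     # fast path: nothing to blend
    value = original * (1 - opacity) + swapped * opacity
    return max(0, min(255, round(value)))

half = blend_pixel(0, 200, 0.5)
over = blend_pixel(10, 250, 2.0)    # out-of-range opacity is clamped to 1.0
under = blend_pixel(10, 250, -1.0)  # clamped to 0.0, so the original wins
```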
# --- START: Mac M1-M5 Optimized Face Detection ---
@@ -363,10 +361,8 @@ def apply_post_processing(current_frame: Frame, swapped_face_bboxes: List[np.nda
pass
PREVIOUS_FRAME_RESULT = processed_frame.copy()
else:
# If interpolation is off or weight is invalid, just use the current frame
# Update state with the current (potentially sharpened) frame
# Reset previous frame state if interpolation was just turned off or weight is invalid
PREVIOUS_FRAME_RESULT = processed_frame.copy()
# Interpolation is off or weight is invalid — no need to cache
PREVIOUS_FRAME_RESULT = None
return final_frame
@@ -756,9 +752,9 @@ def create_lower_mouth_mask(
return mask, mouth_cutout, mouth_box, lower_lip_polygon
try: # Wrap main logic in try-except
# Use outer mouth landmarks (52-63) to capture the lips only
# This avoids including the chin/jawline, preserving the face shape from the swap
lower_lip_order = list(range(52, 64))
# Use outer mouth landmarks (52-71) to capture the full mouth area
# This covers both upper and lower lips for proper mouth preservation
lower_lip_order = list(range(52, 72))
# Check if all indices are valid for the loaded landmarks (already partially done by < 106 check)
if max(lower_lip_order) >= landmarks.shape[0]:
@@ -778,9 +774,18 @@ def create_lower_mouth_mask(
return mask, mouth_cutout, mouth_box, lower_lip_polygon
mask_down_size = getattr(modules.globals, "mask_down_size", 0.1) # Default 0.1
expansion_factor = 1 + mask_down_size
expanded_landmarks = (lower_lip_landmarks - center) * expansion_factor + center
mouth_mask_size = getattr(modules.globals, "mouth_mask_size", 0.0) # 0-100 slider
# 0=tight lip outline, 50=covers mouth area, 100=mouth to chin
expansion_factor = 1 + (mouth_mask_size / 100.0) * 2.5
# Expand landmarks from center, with extra downward bias toward chin
offsets = lower_lip_landmarks - center
# Add extra downward expansion for points below center (toward chin)
chin_bias = 1 + (mouth_mask_size / 100.0) * 1.5 # extra vertical stretch downward
scale_y = np.where(offsets[:, 1] > 0, expansion_factor * chin_bias, expansion_factor)
expanded_landmarks = lower_lip_landmarks.copy()
expanded_landmarks[:, 0] = center[0] + offsets[:, 0] * expansion_factor
expanded_landmarks[:, 1] = center[1] + offsets[:, 1] * scale_y
# Ensure landmarks are finite after adjustments
if not np.all(np.isfinite(expanded_landmarks)):
@@ -887,8 +892,8 @@ def draw_mouth_mask_visualization(
print(f"Error drawing polygon for visualization: {e}") # Optional debug
pass
# Optional: Draw bounding box (red rectangle)
# cv2.rectangle(vis_frame, (min_x, min_y), (max_x, max_y), (0, 0, 255), 1)
# Draw bounding box (red rectangle)
cv2.rectangle(vis_frame, (min_x, min_y), (max_x, max_y), (0, 0, 255), 2)
# Optional: Add labels
label_pos_y = min_y - 10 if min_y > 20 else max_y + 15 # Adjust position based on box location
@@ -968,85 +973,34 @@ def apply_mouth_area(
# print("Warning: Mouth cutout is invalid after resize attempt.")
return frame
# --- Color Correction Step ---
# Apply color transfer from ROI (swapped face region) to the original mouth cutout
# This helps match lighting/color before blending
color_corrected_mouth = resized_mouth_cutout # Default to resized if correction fails
try:
# Ensure both images are 3 channels for color transfer
if len(resized_mouth_cutout.shape) == 3 and resized_mouth_cutout.shape[2] == 3 and \
len(roi.shape) == 3 and roi.shape[2] == 3:
color_corrected_mouth = apply_color_transfer(resized_mouth_cutout, roi)
else:
# print("Warning: Cannot apply color transfer, images not BGR.")
pass
except cv2.error as ct_e: # Handle potential errors in color transfer
# print(f"Warning: Color transfer failed: {ct_e}. Using uncorrected mouth cutout.") # Optional debug
pass
except Exception as ct_gen_e:
# print(f"Warning: Unexpected error during color transfer: {ct_gen_e}")
pass
# --- End Color Correction ---
# --- Mask Creation ---
# Create a mask based *specifically* on the mouth_polygon, relative to the ROI
# Create a mask based on the mouth_polygon, relative to the ROI
polygon_mask_roi = np.zeros(roi.shape[:2], dtype=np.uint8)
# Adjust polygon coordinates relative to the ROI's top-left corner
adjusted_polygon = mouth_polygon - [min_x, min_y]
# Draw the filled polygon on the ROI mask
cv2.fillPoly(polygon_mask_roi, [adjusted_polygon.astype(np.int32)], 255)
# Feather the polygon mask (Gaussian blur)
mask_feather_ratio = getattr(modules.globals, "mask_feather_ratio", 12) # Default 12
# Calculate feather amount based on the smaller dimension of the box
feather_base_dim = min(box_width, box_height)
feather_amount = max(1, min(30, feather_base_dim // max(1, mask_feather_ratio))) # Avoid div by zero
# Ensure kernel size is odd and positive
# Feather the edges with Gaussian blur for smooth blending
feather_amount = max(1, min(30, min(box_width, box_height) // 8))
kernel_size = 2 * feather_amount + 1
feathered_polygon_mask = cv2.GaussianBlur(polygon_mask_roi.astype(float), (kernel_size, kernel_size), 0)
feathered_mask = cv2.GaussianBlur(polygon_mask_roi.astype(np.float32), (kernel_size, kernel_size), 0)
# Normalize feathered mask to [0.0, 1.0] range
max_val = feathered_polygon_mask.max()
if max_val > 1e-6: # Avoid division by zero
feathered_polygon_mask = feathered_polygon_mask / max_val
# Normalize to [0.0, 1.0]
max_val = feathered_mask.max()
if max_val > 1e-6:
feathered_mask = feathered_mask / max_val
else:
feathered_polygon_mask.fill(0.0) # Mask is all black if max is near zero
# --- End Mask Creation ---
feathered_mask.fill(0.0)
# --- Refined Blending ---
# Get the corresponding ROI from the *full face mask* (already blurred)
# Ensure face_mask is float and normalized [0.0, 1.0]
if face_mask.dtype != np.float64 and face_mask.dtype != np.float32:
face_mask_float = face_mask.astype(float) / 255.0
else: # Assume already float [0,1] if type is float
face_mask_float = face_mask
face_mask_roi = face_mask_float[min_y:max_y, min_x:max_x]
# Combine the feathered mouth polygon mask with the face mask ROI
# Use minimum to ensure we only affect area inside both masks (mouth area within face)
# This helps blend the edges smoothly with the surrounding swapped face region
combined_mask = np.minimum(feathered_polygon_mask, face_mask_roi)
# Expand mask to 3 channels for blending (ensure it matches image channels)
# --- Blending: paste original mouth onto swapped face ---
if len(frame.shape) == 3 and frame.shape[2] == 3:
combined_mask_3channel = combined_mask[:, :, np.newaxis]
mask_3ch = feathered_mask[:, :, np.newaxis].astype(np.float32)
inv_mask = 1.0 - mask_3ch
# Ensure data types are compatible for blending (float or double for mask, uint8 for images)
color_corrected_mouth_uint8 = color_corrected_mouth.astype(np.uint8)
roi_uint8 = roi.astype(np.uint8)
combined_mask_float = combined_mask_3channel.astype(np.float64) # Use float64 for precision in mask
# Blend: (original_mouth * mask) + (swapped_face * (1 - mask))
blended_roi = (resized_mouth_cutout.astype(np.float32) * mask_3ch +
roi.astype(np.float32) * inv_mask)
# Blend: (original_mouth * combined_mask) + (swapped_face_roi * (1 - combined_mask))
blended_roi = (color_corrected_mouth_uint8 * combined_mask_float +
roi_uint8 * (1.0 - combined_mask_float))
# Place the blended ROI back into the frame
frame[min_y:max_y, min_x:max_x] = blended_roi.astype(np.uint8)
else:
# print("Warning: Cannot apply mouth mask blending, frame is not 3-channel BGR.")
pass # Don't modify frame if it's not BGR
frame[min_y:max_y, min_x:max_x] = np.clip(blended_roi, 0, 255).astype(np.uint8)
except Exception as e:
print(f"Error applying mouth area: {e}") # Optional debug
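Review note: the feather radius is now derived only from the smaller box dimension (one eighth of it, limited to 1..30) instead of the `mask_feather_ratio` global, and the Gaussian kernel size is forced to be odd. A small sketch of that calculation, with a hypothetical helper name:

```python
def feather_kernel_size(box_width, box_height):
    """Return the odd Gaussian kernel size used to feather the mouth polygon."""
    feather_amount = max(1, min(30, min(box_width, box_height) // 8))
    return 2 * feather_amount + 1  # 2n + 1 is always odd, as GaussianBlur requires

tiny = feather_kernel_size(4, 4)          # 4 // 8 == 0, raised to the minimum of 1
typical = feather_kernel_size(80, 120)    # 80 // 8 == 10
huge = feather_kernel_size(1000, 1000)    # capped at 30
```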
+289 -115
@@ -3,7 +3,6 @@ import webbrowser
import customtkinter as ctk
from typing import Callable, Tuple
import cv2
from cv2_enumerate_cameras import enumerate_cameras # Add this import
from modules.gpu_processing import gpu_cvt_color, gpu_resize, gpu_flip
from PIL import Image, ImageOps
import time
@@ -11,6 +10,8 @@ import json
import queue
import threading
import numpy as np
import requests
import tempfile
import modules.globals
import modules.metadata
from modules.face_analyser import (
@@ -32,12 +33,36 @@ from modules.utilities import (
)
from modules.video_capture import VideoCapturer
from modules.gettext import LanguageManager
from modules.ui_tooltip import ToolTip
from modules import globals
import platform
if platform.system() == "Windows":
from pygrabber.dshow_graph import FilterGraph
# --- Tk 9.0 compatibility patch ---
# In Tk 9.0, Menu.index("end") returns "" instead of raising TclError
# when the menu is empty. CustomTkinter's CTkOptionMenu doesn't handle
# this, causing crashes. This patch adds the missing guard.
try:
from customtkinter.windows.widgets.core_widget_classes import DropdownMenu as _DropdownMenu
_original_add_menu_commands = _DropdownMenu._add_menu_commands
def _patched_add_menu_commands(self, *args, **kwargs):
try:
end_index = self._menu.index("end")
if end_index == "" or end_index is None:
return
except Exception:
pass
_original_add_menu_commands(self, *args, **kwargs)
_DropdownMenu._add_menu_commands = _patched_add_menu_commands
except (ImportError, AttributeError):
pass # CustomTkinter version doesn't have this class path
# --- End Tk 9.0 patch ---
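Review note: the patch follows a common wrap-and-delegate pattern: keep a reference to the original method, install a replacement that checks a precondition, then call through. The sketch below shows that pattern on a stand-in class. `Dropdown`, `end_index` and `add_commands` are hypothetical names for illustration and are not CustomTkinter's API.

```python
class Dropdown:
    """Hypothetical stand-in for a widget class."""

    def __init__(self, end_index):
        self.end_index = end_index
        self.populated = False

    def add_commands(self):
        self.populated = True

# Keep a handle on the original before replacing it
_original_add_commands = Dropdown.add_commands

def _patched_add_commands(self, *args, **kwargs):
    # Return early when the "end" index signals an empty menu
    if self.end_index == "" or self.end_index is None:
        return
    _original_add_commands(self, *args, **kwargs)

Dropdown.add_commands = _patched_add_commands

empty, filled = Dropdown(""), Dropdown(3)
empty.add_commands()
filled.add_commands()
```

One thing worth confirming against the CustomTkinter source is whether the early return can also skip the first population of a menu that starts out empty.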
ROOT = None
POPUP = None
POPUP_LIVE = None
@@ -112,6 +137,7 @@ def save_switch_states():
"show_fps": modules.globals.show_fps,
"mouth_mask": modules.globals.mouth_mask,
"show_mouth_mask_box": modules.globals.show_mouth_mask_box,
"mouth_mask_size": modules.globals.mouth_mask_size,
}
with open("switch_states.json", "w") as f:
json.dump(switch_states, f)
@@ -133,10 +159,10 @@ def load_switch_states():
modules.globals.live_resizable = switch_states.get("live_resizable", False)
modules.globals.fp_ui = switch_states.get("fp_ui", {"face_enhancer": False})
modules.globals.show_fps = switch_states.get("show_fps", False)
modules.globals.mouth_mask = switch_states.get("mouth_mask", False)
modules.globals.show_mouth_mask_box = switch_states.get(
"show_mouth_mask_box", False
)
modules.globals.mouth_mask_size = switch_states.get("mouth_mask_size", 0.0)
# mouth_mask is driven by the slider: on if size > 0, off if 0
modules.globals.mouth_mask = modules.globals.mouth_mask_size > 0
modules.globals.show_mouth_mask_box = False # always start hidden
except FileNotFoundError:
# If the file doesn't exist, use default values
pass
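Review note: with this hunk the `mouth_mask` boolean is no longer read from `switch_states.json`; it is derived from the saved slider value, and the bounding box always starts hidden. A minimal sketch of that loading rule, using a hypothetical function that takes the JSON text directly:

```python
import json

def load_mouth_mask_state(raw_json):
    """Return (size, enabled, show_box) from a saved-switches JSON string."""
    states = json.loads(raw_json)
    size = float(states.get("mouth_mask_size", 0.0))
    enabled = size > 0   # derived from the slider value, not read from the file
    show_box = False     # the box always starts hidden
    return size, enabled, show_box

# A file written before this change has no size key, so the mask loads as off
legacy = load_mouth_mask_state('{"mouth_mask": true}')
current = load_mouth_mask_state('{"mouth_mask_size": 40.0, "show_mouth_mask_box": true}')
```

The first call shows a side effect to be aware of: a settings file saved by an older version with `mouth_mask` enabled will come back disabled until the slider is moved.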
@@ -168,12 +194,20 @@ def create_root(start: Callable[[], None], destroy: Callable[[], None]) -> ctk.C
select_face_button = ctk.CTkButton(
root, text=_("Select a face"), cursor="hand2", command=lambda: select_source_path()
)
select_face_button.place(relx=0.1, rely=0.30, relwidth=0.3, relheight=0.1)
select_face_button.place(relx=0.1, rely=0.30, relwidth=0.24, relheight=0.1)
ToolTip(select_face_button, _("Choose the source face image to swap onto the target"))
random_face_button = ctk.CTkButton(
root, text="🔄", cursor="hand2", width=30, command=lambda: fetch_random_face()
)
random_face_button.place(relx=0.35, rely=0.30, relwidth=0.05, relheight=0.1)
ToolTip(random_face_button, _("Get a random face from thispersondoesnotexist.com"))
swap_faces_button = ctk.CTkButton(
root, text="", cursor="hand2", command=lambda: swap_faces_paths()
)
swap_faces_button.place(relx=0.45, rely=0.30, relwidth=0.1, relheight=0.1)
ToolTip(swap_faces_button, _("Swap source and target images"))
select_target_button = ctk.CTkButton(
root,
@@ -182,6 +216,7 @@ def create_root(start: Callable[[], None], destroy: Callable[[], None]) -> ctk.C
command=lambda: select_target_path(),
)
select_target_button.place(relx=0.6, rely=0.30, relwidth=0.3, relheight=0.1)
ToolTip(select_target_button, _("Choose the target image or video to apply face swap to"))
keep_fps_value = ctk.BooleanVar(value=modules.globals.keep_fps)
keep_fps_checkbox = ctk.CTkSwitch(
@@ -194,7 +229,8 @@ def create_root(start: Callable[[], None], destroy: Callable[[], None]) -> ctk.C
save_switch_states(),
),
)
keep_fps_checkbox.place(relx=0.1, rely=0.5)
keep_fps_checkbox.place(relx=0.1, rely=0.42)
ToolTip(keep_fps_checkbox, _("Output video keeps the original frame rate"))
keep_frames_value = ctk.BooleanVar(value=modules.globals.keep_frames)
keep_frames_switch = ctk.CTkSwitch(
@@ -207,20 +243,8 @@ def create_root(start: Callable[[], None], destroy: Callable[[], None]) -> ctk.C
save_switch_states(),
),
)
keep_frames_switch.place(relx=0.1, rely=0.55)
enhancer_value = ctk.BooleanVar(value=modules.globals.fp_ui["face_enhancer"])
enhancer_switch = ctk.CTkSwitch(
root,
text=_("Face Enhancer"),
variable=enhancer_value,
cursor="hand2",
command=lambda: (
update_tumbler("face_enhancer", enhancer_value.get()),
save_switch_states(),
),
)
enhancer_switch.place(relx=0.1, rely=0.6)
keep_frames_switch.place(relx=0.1, rely=0.47)
ToolTip(keep_frames_switch, _("Keep extracted frames on disk after processing"))
keep_audio_value = ctk.BooleanVar(value=modules.globals.keep_audio)
keep_audio_switch = ctk.CTkSwitch(
@@ -233,7 +257,8 @@ def create_root(start: Callable[[], None], destroy: Callable[[], None]) -> ctk.C
save_switch_states(),
),
)
keep_audio_switch.place(relx=0.6, rely=0.5)
keep_audio_switch.place(relx=0.6, rely=0.42)
ToolTip(keep_audio_switch, _("Copy audio track from the source video to output"))
many_faces_value = ctk.BooleanVar(value=modules.globals.many_faces)
many_faces_switch = ctk.CTkSwitch(
@@ -246,7 +271,8 @@ def create_root(start: Callable[[], None], destroy: Callable[[], None]) -> ctk.C
save_switch_states(),
),
)
many_faces_switch.place(relx=0.6, rely=0.55)
many_faces_switch.place(relx=0.6, rely=0.47)
ToolTip(many_faces_switch, _("Swap every detected face, not just the primary one"))
color_correction_value = ctk.BooleanVar(value=modules.globals.color_correction)
color_correction_switch = ctk.CTkSwitch(
@@ -259,7 +285,8 @@ def create_root(start: Callable[[], None], destroy: Callable[[], None]) -> ctk.C
save_switch_states(),
),
)
color_correction_switch.place(relx=0.6, rely=0.6)
color_correction_switch.place(relx=0.6, rely=0.57)
ToolTip(color_correction_switch, _("Fix blue/green color cast from some webcams"))
# nsfw_value = ctk.BooleanVar(value=modules.globals.nsfw_filter)
# nsfw_switch = ctk.CTkSwitch(root, text='NSFW filter', variable=nsfw_value, cursor='hand2', command=lambda: setattr(modules.globals, 'nsfw_filter', nsfw_value.get()))
@@ -277,7 +304,8 @@ def create_root(start: Callable[[], None], destroy: Callable[[], None]) -> ctk.C
close_mapper_window() if not map_faces.get() else None
),
)
map_faces_switch.place(relx=0.1, rely=0.65)
map_faces_switch.place(relx=0.1, rely=0.52)
ToolTip(map_faces_switch, _("Manually assign which source face maps to which target face"))
poisson_blend_value = ctk.BooleanVar(value=modules.globals.poisson_blend)
poisson_blend_switch = ctk.CTkSwitch(
@@ -290,7 +318,8 @@ def create_root(start: Callable[[], None], destroy: Callable[[], None]) -> ctk.C
save_switch_states(),
),
)
poisson_blend_switch.place(relx=0.1, rely=0.7)
poisson_blend_switch.place(relx=0.1, rely=0.57)
ToolTip(poisson_blend_switch, _("Blend face edges smoothly using Poisson blending"))
show_fps_value = ctk.BooleanVar(value=modules.globals.show_fps)
show_fps_switch = ctk.CTkSwitch(
@@ -303,48 +332,34 @@ def create_root(start: Callable[[], None], destroy: Callable[[], None]) -> ctk.C
save_switch_states(),
),
)
show_fps_switch.place(relx=0.6, rely=0.65)
show_fps_switch.place(relx=0.6, rely=0.52)
ToolTip(show_fps_switch, _("Display frames-per-second counter on the live preview"))
# mouth_mask and show_mouth_mask_box are auto-controlled by the Mouth Mask slider
mouth_mask_var = ctk.BooleanVar(value=modules.globals.mouth_mask)
mouth_mask_switch = ctk.CTkSwitch(
root,
text=_("Mouth Mask"),
variable=mouth_mask_var,
cursor="hand2",
command=lambda: setattr(modules.globals, "mouth_mask", mouth_mask_var.get()),
)
mouth_mask_switch.place(relx=0.1, rely=0.45)
show_mouth_mask_box_var = ctk.BooleanVar(value=modules.globals.show_mouth_mask_box)
show_mouth_mask_box_switch = ctk.CTkSwitch(
root,
text=_("Show Mouth Mask Box"),
variable=show_mouth_mask_box_var,
cursor="hand2",
command=lambda: setattr(
modules.globals, "show_mouth_mask_box", show_mouth_mask_box_var.get()
),
)
show_mouth_mask_box_switch.place(relx=0.6, rely=0.45)
start_button = ctk.CTkButton(
root, text=_("Start"), cursor="hand2", command=lambda: analyze_target(start, root)
)
start_button.place(relx=0.15, rely=0.86, relwidth=0.2, relheight=0.05)
start_button.place(relx=0.15, rely=0.78, relwidth=0.2, relheight=0.04)
ToolTip(start_button, _("Begin processing the target image/video with selected face"))
stop_button = ctk.CTkButton(
root, text=_("Destroy"), cursor="hand2", command=lambda: destroy()
)
stop_button.place(relx=0.4, rely=0.86, relwidth=0.2, relheight=0.05)
stop_button.place(relx=0.4, rely=0.78, relwidth=0.2, relheight=0.04)
ToolTip(stop_button, _("Stop processing and close the application"))
preview_button = ctk.CTkButton(
root, text=_("Preview"), cursor="hand2", command=lambda: toggle_preview()
)
preview_button.place(relx=0.65, rely=0.86, relwidth=0.2, relheight=0.05)
preview_button.place(relx=0.65, rely=0.78, relwidth=0.2, relheight=0.04)
ToolTip(preview_button, _("Show/hide a preview of the processed output"))
# --- Camera Selection ---
camera_label = ctk.CTkLabel(root, text=_("Select Camera:"))
camera_label.place(relx=0.1, rely=0.92, relwidth=0.2, relheight=0.05)
camera_label.place(relx=0.1, rely=0.83, relwidth=0.2, relheight=0.03)
available_cameras = get_available_cameras()
camera_indices, camera_names = available_cameras
@@ -363,7 +378,8 @@ def create_root(start: Callable[[], None], destroy: Callable[[], None]) -> ctk.C
root, variable=camera_variable, values=camera_names
)
camera_optionmenu.place(relx=0.35, rely=0.92, relwidth=0.25, relheight=0.05)
camera_optionmenu.place(relx=0.35, rely=0.83, relwidth=0.25, relheight=0.03)
ToolTip(camera_optionmenu, _("Select which camera to use for live mode"))
live_button = ctk.CTkButton(
root,
@@ -383,9 +399,52 @@ def create_root(start: Callable[[], None], destroy: Callable[[], None]) -> ctk.C
else "disabled"
),
)
live_button.place(relx=0.65, rely=0.92, relwidth=0.2, relheight=0.05)
live_button.place(relx=0.65, rely=0.83, relwidth=0.2, relheight=0.03)
ToolTip(live_button, _("Start real-time face swap using webcam"))
# --- End Camera Selection ---
# --- Face Enhancer Dropdown ---
enhancer_options = ["None", "GFPGAN", "GPEN-512", "GPEN-256"]
enhancer_key_map = {
"None": None,
"GFPGAN": "face_enhancer",
"GPEN-512": "face_enhancer_gpen512",
"GPEN-256": "face_enhancer_gpen256",
}
# Determine initial value from current fp_ui state
initial_enhancer = "None"
if modules.globals.fp_ui.get("face_enhancer", False):
initial_enhancer = "GFPGAN"
elif modules.globals.fp_ui.get("face_enhancer_gpen512", False):
initial_enhancer = "GPEN-512"
elif modules.globals.fp_ui.get("face_enhancer_gpen256", False):
initial_enhancer = "GPEN-256"
enhancer_variable = ctk.StringVar(value=initial_enhancer)
def on_enhancer_change(choice: str):
# Disable all enhancers first
for key in ["face_enhancer", "face_enhancer_gpen256", "face_enhancer_gpen512"]:
update_tumbler(key, False)
# Enable the selected one
selected_key = enhancer_key_map.get(choice)
if selected_key:
update_tumbler(selected_key, True)
save_switch_states()
enhancer_label = ctk.CTkLabel(root, text="Face Enhancer:")
enhancer_label.place(relx=0.1, rely=0.62, relwidth=0.2, relheight=0.03)
enhancer_dropdown = ctk.CTkOptionMenu(
root,
variable=enhancer_variable,
values=enhancer_options,
command=on_enhancer_change,
)
enhancer_dropdown.place(relx=0.35, rely=0.62, relwidth=0.3, relheight=0.03)
ToolTip(enhancer_dropdown, _("Select a face enhancement model (None = no enhancement)"))
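Review note: replacing the single switch with a dropdown makes the enhancers mutually exclusive: every flag is cleared first, then at most one is set. The sketch below models that on a plain dict instead of `modules.globals.fp_ui`; `select_enhancer` is a hypothetical helper.

```python
ENHANCER_KEYS = {
    "None": None,
    "GFPGAN": "face_enhancer",
    "GPEN-512": "face_enhancer_gpen512",
    "GPEN-256": "face_enhancer_gpen256",
}

def select_enhancer(state, choice):
    """Return a copy of `state` with only the chosen enhancer flag set."""
    new_state = dict(state)
    for key in ENHANCER_KEYS.values():
        if key is not None:
            new_state[key] = False   # disable every enhancer first
    selected = ENHANCER_KEYS.get(choice)
    if selected:
        new_state[selected] = True   # then enable the chosen one, if any
    return new_state

state = select_enhancer({"face_enhancer": True}, "GPEN-512")
cleared = select_enhancer(state, "None")
```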
# 1) Define a DoubleVar for transparency (0 = fully transparent, 1 = fully opaque)
transparency_var = ctk.DoubleVar(value=1.0)
@@ -405,9 +464,9 @@ def create_root(start: Callable[[], None], destroy: Callable[[], None]) -> ctk.C
modules.globals.face_swapper_enabled = True
update_status(f"Transparency set to {percentage}%")
# 2) Transparency label and slider (placed ABOVE sharpness)
# 2) Transparency label and slider
transparency_label = ctk.CTkLabel(root, text="Transparency:")
transparency_label.place(relx=0.15, rely=0.75, relwidth=0.2, relheight=0.05)
transparency_label.place(relx=0.15, rely=0.66, relwidth=0.2, relheight=0.03)
transparency_slider = ctk.CTkSlider(
root,
@@ -423,7 +482,8 @@ def create_root(start: Callable[[], None], destroy: Callable[[], None]) -> ctk.C
border_width=1,
corner_radius=3,
)
transparency_slider.place(relx=0.35, rely=0.77, relwidth=0.5, relheight=0.02)
transparency_slider.place(relx=0.35, rely=0.67, relwidth=0.5, relheight=0.02)
ToolTip(transparency_slider, _("Blend between original and swapped face (0% = original, 100% = fully swapped)"))
# 3) Sharpness label & slider
sharpness_var = ctk.DoubleVar(value=0.0) # start at 0.0
@@ -432,7 +492,7 @@ def create_root(start: Callable[[], None], destroy: Callable[[], None]) -> ctk.C
update_status(f"Sharpness set to {value:.1f}")
sharpness_label = ctk.CTkLabel(root, text="Sharpness:")
sharpness_label.place(relx=0.15, rely=0.80, relwidth=0.2, relheight=0.05)
sharpness_label.place(relx=0.15, rely=0.69, relwidth=0.2, relheight=0.03)
sharpness_slider = ctk.CTkSlider(
root,
@@ -448,17 +508,64 @@ def create_root(start: Callable[[], None], destroy: Callable[[], None]) -> ctk.C
border_width=1,
corner_radius=3,
)
sharpness_slider.place(relx=0.35, rely=0.82, relwidth=0.5, relheight=0.02)
sharpness_slider.place(relx=0.35, rely=0.70, relwidth=0.5, relheight=0.02)
ToolTip(sharpness_slider, _("Sharpen the enhanced face output"))
# 4) Mouth Mask Size slider
mouth_mask_size_var = ctk.DoubleVar(value=modules.globals.mouth_mask_size)
def on_mouth_mask_size_change(value: float):
val = float(value)
modules.globals.mouth_mask_size = val
# Auto-enable/disable mouth mask based on slider position
if val > 0:
modules.globals.mouth_mask = True
mouth_mask_var.set(True)
else:
modules.globals.mouth_mask = False
mouth_mask_var.set(False)
modules.globals.show_mouth_mask_box = False
def on_mouth_mask_slider_release(event):
# Hide bounding box when user releases the slider
modules.globals.show_mouth_mask_box = False
def on_mouth_mask_slider_press(event):
# Show bounding box while dragging
if modules.globals.mouth_mask_size > 0:
modules.globals.show_mouth_mask_box = True
mouth_mask_size_label = ctk.CTkLabel(root, text="Mouth Mask:")
mouth_mask_size_label.place(relx=0.15, rely=0.72, relwidth=0.2, relheight=0.03)
mouth_mask_size_slider = ctk.CTkSlider(
root,
from_=0.0,
to=100.0,
variable=mouth_mask_size_var,
command=on_mouth_mask_size_change,
fg_color="#E0E0E0",
progress_color="#007BFF",
button_color="#FFFFFF",
button_hover_color="#CCCCCC",
height=5,
border_width=1,
corner_radius=3,
)
mouth_mask_size_slider.place(relx=0.35, rely=0.73, relwidth=0.5, relheight=0.02)
mouth_mask_size_slider.bind("<ButtonPress-1>", on_mouth_mask_slider_press)
mouth_mask_size_slider.bind("<ButtonRelease-1>", on_mouth_mask_slider_release)
ToolTip(mouth_mask_size_slider, _("0 = use swapped mouth, 100 = expose original mouth to chin area"))
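Review note: three callbacks now drive the mouth mask state: value change, mouse press and mouse release. The sketch below models them without any GUI so the ordering can be checked. `MouthMaskState` is a hypothetical holder for the three globals involved.

```python
class MouthMaskState:
    """Hypothetical holder for the three globals the slider callbacks touch."""

    def __init__(self):
        self.size = 0.0
        self.enabled = False
        self.show_box = False

    def on_change(self, value):
        self.size = float(value)
        self.enabled = self.size > 0
        if not self.enabled:
            self.show_box = False  # nothing to outline once the mask is off

    def on_press(self):
        if self.size > 0:          # checked at press time, before any drag
            self.show_box = True

    def on_release(self):
        self.show_box = False

s = MouthMaskState()
s.on_press()                       # size is still 0, so the box stays hidden
s.on_change(30)
first_drag = (s.enabled, s.show_box)
s.on_release()
s.on_press()                       # size is now 30, so the box appears
second_press = s.show_box
s.on_change(0)
after_zero = (s.enabled, s.show_box)
```

If this model matches the real event order, the box is not drawn during the very first drag away from 0, because the press handler runs while the size is still 0.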
# Status and link at the bottom
global status_label
status_label = ctk.CTkLabel(root, text=None, justify="center")
status_label.place(relx=0.1, rely=0.96, relwidth=0.8)
status_label.place(relx=0.1, rely=0.75, relwidth=0.8)
donate_label = ctk.CTkLabel(
root, text="Deep Live Cam", justify="center", cursor="hand2"
)
donate_label.place(relx=0.1, rely=0.98, relwidth=0.8)
donate_label.place(relx=0.1, rely=0.87, relwidth=0.8)
donate_label.configure(
text_color=ctk.ThemeManager.theme.get("URL").get("text_color")
)
@@ -667,6 +774,26 @@ def update_tumbler(var: str, value: bool) -> None:
)
def fetch_random_face() -> None:
PREVIEW.withdraw()
try:
response = requests.get(
"https://thispersondoesnotexist.com/",
headers={"User-Agent": "Mozilla/5.0"},
timeout=10,
)
response.raise_for_status()
temp_dir = tempfile.gettempdir()
temp_path = os.path.join(temp_dir, "deep_live_cam_random_face.jpg")
with open(temp_path, "wb") as f:
f.write(response.content)
modules.globals.source_path = temp_path
image = render_image_preview(temp_path, (200, 200))
source_label.configure(image=image)
except Exception as e:
print(f"Failed to fetch random face: {e}")
def select_source_path() -> None:
global RECENT_DIRECTORY_SOURCE, img_ft, vid_ft
@@ -922,21 +1049,13 @@ def get_available_cameras():
camera_indices = []
camera_names = []
if platform.system() == "Darwin": # macOS specific handling
# Try to open the default FaceTime camera first
cap = cv2.VideoCapture(0)
if cap.isOpened():
camera_indices.append(0)
camera_names.append("FaceTime Camera")
cap.release()
# On macOS, additional cameras typically use indices 1 and 2
for i in [1, 2]:
cap = cv2.VideoCapture(i)
if cap.isOpened():
camera_indices.append(i)
camera_names.append(f"Camera {i}")
cap.release()
if platform.system() == "Darwin":
# Do NOT probe cameras with cv2.VideoCapture on macOS — probing
# invalid indices triggers the OBSENSOR backend and causes SIGSEGV.
# Default to indices 0 and 1 (covers FaceTime + one USB camera).
# The user can select the correct index from the UI dropdown.
camera_indices = [0, 1]
camera_names = ["Camera 0", "Camera 1"]
else:
# Linux camera detection - test first 10 indices
for i in range(10):
@@ -974,28 +1093,48 @@ def _capture_thread_func(cap, capture_queue, stop_event):
pass
# How often to run full face detection. On intermediate frames the last
# detected face positions are reused, which significantly reduces the
# per-frame cost of the processing thread.
DETECT_EVERY_N = 2
def _detection_thread_func(latest_frame_holder, detection_result, detection_lock, stop_event):
"""Detection thread: continuously runs face detection on the latest
captured frame and stores results in detection_result under detection_lock.
This decouples face detection (~15-30ms) from face swapping (~5-10ms)
so the swap loop never blocks on detection, significantly improving
live mode FPS."""
while not stop_event.is_set():
with detection_lock:
frame = latest_frame_holder[0]
if frame is None:
time.sleep(0.005)
continue
if modules.globals.many_faces:
many = get_many_faces(frame)
with detection_lock:
detection_result['target_face'] = None
detection_result['many_faces'] = many
else:
face = get_one_face(frame)
with detection_lock:
detection_result['target_face'] = face
detection_result['many_faces'] = None
def _processing_thread_func(capture_queue, processed_queue, stop_event):
"""Processing thread: takes raw frames from capture_queue, applies face
processing, and puts results into processed_queue. Drops processed frames
when the output queue is full so the UI always gets the latest result.
def _processing_thread_func(capture_queue, processed_queue, stop_event,
latest_frame_holder, detection_result, detection_lock):
"""Processing thread: takes raw frames from capture_queue, reads the
latest detection result from the shared detection_result dict, applies
face swap/enhancement, and puts results into processed_queue.
Uses DETECT_EVERY_N to skip expensive face detection on intermediate
frames, reusing cached face positions instead."""
Face detection runs concurrently in _detection_thread_func — this thread
only reads cached results so it never blocks on detection."""
frame_processors = get_frame_processors_modules(modules.globals.frame_processors)
source_image = None
last_source_path = None
prev_time = time.time()
fps_update_interval = 0.5
frame_count = 0
fps = 0
proc_frame_index = 0
cached_target_face = None # cached single-face result
cached_many_faces = None # cached many-faces result
while not stop_event.is_set():
try:
@@ -1003,32 +1142,37 @@ def _processing_thread_func(capture_queue, processed_queue, stop_event):
except queue.Empty:
continue
temp_frame = frame.copy()
run_detection = (proc_frame_index % DETECT_EVERY_N == 0)
proc_frame_index += 1
temp_frame = frame
if modules.globals.live_mirror:
temp_frame = gpu_flip(temp_frame, 1)
# Publish the mirrored frame for the detection thread to pick up
with detection_lock:
latest_frame_holder[0] = temp_frame
if not modules.globals.map_faces:
if source_image is None and modules.globals.source_path:
if modules.globals.source_path and modules.globals.source_path != last_source_path:
last_source_path = modules.globals.source_path
source_image = get_one_face(cv2.imread(modules.globals.source_path))
# Update face detection cache on detection frames
if run_detection or (cached_target_face is None and cached_many_faces is None):
if modules.globals.many_faces:
cached_many_faces = get_many_faces(temp_frame)
cached_target_face = None
else:
cached_target_face = get_one_face(temp_frame)
cached_many_faces = None
# Read latest detection results (brief lock to avoid blocking detection thread)
with detection_lock:
cached_target_face = detection_result.get('target_face')
cached_many_faces = detection_result.get('many_faces')
for frame_processor in frame_processors:
if frame_processor.NAME == "DLC.FACE-ENHANCER":
if modules.globals.fp_ui["face_enhancer"]:
temp_frame = frame_processor.process_frame(None, temp_frame)
elif frame_processor.NAME == "DLC.FACE-ENHANCER-GPEN256":
if modules.globals.fp_ui.get("face_enhancer_gpen256", False):
temp_frame = frame_processor.process_frame(None, temp_frame)
elif frame_processor.NAME == "DLC.FACE-ENHANCER-GPEN512":
if modules.globals.fp_ui.get("face_enhancer_gpen512", False):
temp_frame = frame_processor.process_frame(None, temp_frame)
elif frame_processor.NAME == "DLC.FACE-SWAPPER":
# Use cached face positions to skip redundant detection
# Use cached face positions from detection thread
swapped_bboxes = []
if modules.globals.many_faces and cached_many_faces:
result = temp_frame.copy()
@@ -1051,6 +1195,10 @@ def _processing_thread_func(capture_queue, processed_queue, stop_event):
if frame_processor.NAME == "DLC.FACE-ENHANCER":
if modules.globals.fp_ui["face_enhancer"]:
temp_frame = frame_processor.process_frame_v2(temp_frame)
elif frame_processor.NAME in ("DLC.FACE-ENHANCER-GPEN256", "DLC.FACE-ENHANCER-GPEN512"):
fp_key = frame_processor.NAME.split(".")[-1].lower().replace("-", "_")
if modules.globals.fp_ui.get(fp_key, False):
temp_frame = frame_processor.process_frame_v2(temp_frame)
else:
temp_frame = frame_processor.process_frame_v2(temp_frame)
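The GPEN branch in the hunk above derives the `fp_ui` key from the processor's `NAME` with a chain of string operations. A minimal sketch of that transformation in isolation follows; `name_to_fp_key` is a hypothetical helper name used only for illustration and is not part of the repository.

```python
def name_to_fp_key(name: str) -> str:
    """Hypothetical helper: map a processor NAME to its fp_ui key.

    Takes the part after the last ".", lowercases it, and swaps
    hyphens for underscores, mirroring the expression in the diff.
    """
    return name.split(".")[-1].lower().replace("-", "_")


print(name_to_fp_key("DLC.FACE-ENHANCER-GPEN256"))  # face_enhancer_gpen256
print(name_to_fp_key("DLC.FACE-ENHANCER-GPEN512"))  # face_enhancer_gpen512
```

The resulting strings match the literal keys `face_enhancer_gpen256` and `face_enhancer_gpen512` used in the non-mapped branch earlier in the same function.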
@@ -1104,6 +1252,14 @@ def create_webcam_preview(camera_index: int):
processed_queue = queue.Queue(maxsize=2)
stop_event = threading.Event()
# Shared state for the detection pipeline.
# latest_frame_holder[0] is the most recent raw frame for the detection
# thread; detection_result holds the last detected faces for the
# processing thread to read. Both are guarded by detection_lock.
detection_lock = threading.Lock()
latest_frame_holder = [None]
detection_result = {'target_face': None, 'many_faces': None}
# Start capture thread
cap_thread = threading.Thread(
target=_capture_thread_func,
@@ -1112,21 +1268,45 @@ def create_webcam_preview(camera_index: int):
)
cap_thread.start()
# Start detection thread — runs face detection asynchronously so the
# processing/swap thread never blocks on it
det_thread = threading.Thread(
target=_detection_thread_func,
args=(latest_frame_holder, detection_result, detection_lock, stop_event),
daemon=True,
)
det_thread.start()
# Start processing thread
proc_thread = threading.Thread(
target=_processing_thread_func,
args=(capture_queue, processed_queue, stop_event),
args=(capture_queue, processed_queue, stop_event,
latest_frame_holder, detection_result, detection_lock),
daemon=True,
)
proc_thread.start()
# Main (UI) thread: pull processed frames and update the display
while not stop_event.is_set():
# Cleanup helper called from the display loop when preview closes
def _cleanup():
stop_event.set()
cap_thread.join(timeout=2.0)
det_thread.join(timeout=2.0)
proc_thread.join(timeout=2.0)
cap.release()
PREVIEW.withdraw()
# Non-blocking display loop using ROOT.after() — avoids blocking the
# Tk event loop which could cause UI freezes or re-entrancy issues
def _display_next_frame():
if stop_event.is_set() or PREVIEW.state() == "withdrawn":
_cleanup()
return
try:
temp_frame = processed_queue.get(timeout=0.03)
temp_frame = processed_queue.get_nowait()
except queue.Empty:
ROOT.update()
continue
ROOT.after(16, _display_next_frame)
return
if modules.globals.live_resizable:
temp_frame = fit_image_to_size(
@@ -1144,17 +1324,11 @@ def create_webcam_preview(camera_index: int):
)
image = ctk.CTkImage(image, size=image.size)
preview_label.configure(image=image)
ROOT.update()
if PREVIEW.state() == "withdrawn":
break
ROOT.after(16, _display_next_frame)
# Signal threads to stop and wait for them
stop_event.set()
cap_thread.join(timeout=2.0)
proc_thread.join(timeout=2.0)
cap.release()
PREVIEW.withdraw()
# Kick off the non-blocking display loop
ROOT.after(0, _display_next_frame)
def create_source_target_popup_for_webcam(
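The detection pipeline added in this file shares a one-slot frame holder and a result dict between threads, both guarded by a single lock. The sketch below shows that pattern with the standard library only. It assumes a stand-in `detect` callable instead of the real face detector, and the names `detection_worker` and `run_demo` are hypothetical.

```python
import threading
import time


def detection_worker(frame_holder, result, lock, stop_event, detect):
    """Repeatedly run detect() on the most recently published frame."""
    while not stop_event.is_set():
        with lock:
            frame = frame_holder[0]  # brief lock: only copy the reference
        if frame is None:
            time.sleep(0.005)  # nothing published yet
            continue
        found = detect(frame)  # slow work happens outside the lock
        with lock:
            result["faces"] = found


def run_demo():
    """Publish one frame and wait (bounded) for the worker's result."""
    lock = threading.Lock()
    frame_holder = [None]
    result = {"faces": None}
    stop_event = threading.Event()
    worker = threading.Thread(
        target=detection_worker,
        # str.upper stands in for the real detector (assumption).
        args=(frame_holder, result, lock, stop_event, str.upper),
        daemon=True,
    )
    worker.start()
    with lock:
        frame_holder[0] = "frame-1"
    deadline = time.time() + 2.0
    while time.time() < deadline:
        with lock:
            if result["faces"] is not None:
                break
        time.sleep(0.005)
    stop_event.set()
    worker.join(timeout=2.0)
    with lock:
        return result["faces"]


print(run_demo())
```

Holding the lock only while copying a reference in or out keeps the consumer from waiting on detection, at the cost that a result may lag the frame it is applied to.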
+74
@@ -0,0 +1,74 @@
"""Lightweight hover tooltip for CustomTkinter widgets."""

import customtkinter as ctk


class ToolTip:
    """Show a floating tooltip popup when the user hovers over a widget.

    Usage:
        ToolTip(my_button, "Helpful description text")
    """

    def __init__(self, widget: ctk.CTkBaseClass, text: str, delay: int = 500):
        self._widget = widget
        self._text = text
        self._delay = delay
        self._tooltip_window = None
        self._after_id = None
        widget.bind("<Enter>", self._schedule_show, add="+")
        widget.bind("<Leave>", self._hide, add="+")

    def _schedule_show(self, event=None):
        self._cancel()
        self._after_id = self._widget.after(self._delay, self._show)

    def _show(self):
        if self._tooltip_window is not None:
            return
        x = self._widget.winfo_rootx() + 20
        y = self._widget.winfo_rooty() + self._widget.winfo_height() + 5
        self._tooltip_window = tw = ctk.CTkToplevel(self._widget)
        tw.withdraw()
        tw.overrideredirect(True)
        label = ctk.CTkLabel(
            tw,
            text=self._text,
            fg_color="#333333",
            text_color="#EEEEEE",
            corner_radius=6,
            padx=8,
            pady=4,
        )
        label.pack()
        tw.update_idletasks()
        # Clamp to screen bounds
        screen_w = tw.winfo_screenwidth()
        screen_h = tw.winfo_screenheight()
        tip_w = tw.winfo_reqwidth()
        tip_h = tw.winfo_reqheight()
        if x + tip_w > screen_w:
            x = screen_w - tip_w - 5
        if y + tip_h > screen_h:
            y = self._widget.winfo_rooty() - tip_h - 5
        tw.geometry(f"+{x}+{y}")
        tw.deiconify()

    def _hide(self, event=None):
        self._cancel()
        if self._tooltip_window is not None:
            self._tooltip_window.destroy()
            self._tooltip_window = None

    def _cancel(self):
        if self._after_id is not None:
            self._widget.after_cancel(self._after_id)
            self._after_id = None
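The screen-bounds clamping inside `_show` can be checked without a display by pulling the arithmetic into a pure function. This is a sketch under that assumption; `clamp_tooltip_position` and its parameters are hypothetical names, and the 5-pixel margin is taken from the code above.

```python
def clamp_tooltip_position(x, y, tip_w, tip_h, screen_w, screen_h,
                           widget_top, margin=5):
    """Hypothetical helper mirroring the clamping arithmetic in _show().

    If the tooltip would run off the right edge, shift it left.
    If it would run off the bottom edge, place it above the widget.
    """
    if x + tip_w > screen_w:
        x = screen_w - tip_w - margin
    if y + tip_h > screen_h:
        y = widget_top - tip_h - margin
    return x, y


# Tooltip near the bottom-right corner of a 1920x1080 screen.
print(clamp_tooltip_position(1900, 1060, 100, 40, 1920, 1080, widget_top=1030))
# Tooltip that already fits is left unchanged.
print(clamp_tooltip_position(10, 10, 100, 40, 1920, 1080, widget_top=0))
```

Note that neither this sketch nor the code above guards against a negative result, which could occur for a tooltip wider than the screen or a widget at the very top of it.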
+16 -7
@@ -15,10 +15,6 @@ import modules.globals
TEMP_FILE = "temp.mp4"
TEMP_DIRECTORY = "temp"
# monkey patch ssl for mac
if platform.system().lower() == "darwin":
ssl._create_default_https_context = ssl._create_unverified_context
def run_ffmpeg(args: List[str]) -> bool:
"""Run ffmpeg with hardware acceleration and optimized settings."""
@@ -286,8 +282,15 @@ def conditional_download(download_directory_path: str, urls: List[str]) -> None:
download_directory_path, os.path.basename(url)
)
if not os.path.exists(download_file_path):
request = urllib.request.urlopen(url) # type: ignore[attr-defined]
total = int(request.headers.get("Content-Length", 0))
request = urllib.request.Request(url)
# Create a specific SSL context for macOS to avoid globally disabling verification
ctx = None
if platform.system().lower() == "darwin":
ctx = ssl._create_unverified_context()
response = urllib.request.urlopen(request, context=ctx)
total = int(response.headers.get("Content-Length", 0))
with tqdm(
total=total,
desc="Downloading",
@@ -295,7 +298,13 @@ def conditional_download(download_directory_path: str, urls: List[str]) -> None:
unit_scale=True,
unit_divisor=1024,
) as progress:
urllib.request.urlretrieve(url, download_file_path, reporthook=lambda count, block_size, total_size: progress.update(block_size)) # type: ignore[attr-defined]
with open(download_file_path, "wb") as f:
while True:
buffer = response.read(8192)
if not buffer:
break
f.write(buffer)
progress.update(len(buffer))
def resolve_relative_path(path: str) -> str:
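The new download loop reads the response in 8192-byte chunks and advances the progress bar by the number of bytes actually read. A self-contained sketch of that loop follows, using `io.BytesIO` as a stand-in for the HTTP response and a plain list in place of `tqdm`; `copy_in_chunks` is a hypothetical name.

```python
import io


def copy_in_chunks(src, dst, chunk_size=8192, on_progress=None):
    """Copy src to dst in fixed-size chunks and return total bytes written."""
    total = 0
    while True:
        buffer = src.read(chunk_size)
        if not buffer:  # empty bytes object means end of stream
            break
        dst.write(buffer)
        total += len(buffer)
        if on_progress is not None:
            on_progress(len(buffer))  # report actual bytes, not chunk_size
    return total


source = io.BytesIO(b"x" * 20000)  # stand-in for the urlopen() response
target = io.BytesIO()              # stand-in for the output file
steps = []
written = copy_in_chunks(source, target, on_progress=steps.append)
print(written, steps)  # 20000 [8192, 8192, 3616]
```

Reporting `len(buffer)` keeps the progress total equal to the file size, whereas the removed `reporthook` lambda added a full `block_size` on every call and could overshoot on the final partial block.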
+2 -2
@@ -9,8 +9,8 @@ tk==0.1.0
customtkinter==5.2.2
pillow==12.1.1
onnxruntime-silicon==1.16.3; sys_platform == 'darwin' and platform_machine == 'arm64'
onnxruntime-gpu==1.24.2; sys_platform != 'darwin'
onnxruntime-gpu==1.23.2; sys_platform != 'darwin'
tensorflow; sys_platform != 'darwin'
opennsfw2==0.10.2
protobuf==5.29.6
protobuf==4.25.1
pygrabber
+3
@@ -1,3 +1,6 @@
import os
os.environ.setdefault('TK_SILENCE_DEPRECATION', '1')
import tkinter
# Only needs to be imported once at the beginning of the application
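`os.environ.setdefault` only writes the variable when it is not already set, so a value exported by the user before launch still wins. A small sketch of that behaviour, using a plain dict in place of the real environment; `silence_tk_deprecation` is a hypothetical helper name.

```python
import os


def silence_tk_deprecation(environ=os.environ) -> str:
    """Hypothetical helper: set TK_SILENCE_DEPRECATION unless already present."""
    environ.setdefault("TK_SILENCE_DEPRECATION", "1")
    return environ["TK_SILENCE_DEPRECATION"]


print(silence_tk_deprecation({}))                               # 1
print(silence_tk_deprecation({"TK_SILENCE_DEPRECATION": "0"}))  # 0
```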