Project Overview
When digitizing physical photo albums using a flatbed scanner, a common bottleneck is the manual effort required to crop, deskew, and save individual prints from batch scan sheets. This project is a robust, production-grade Python solution designed to automate this workflow.
It combines classical computer vision pipelines with a multi-threaded desktop GUI dashboard, flexible bulk CLI tools, and automated compilation workflows. By leveraging advanced image segmentation and perspective transformations, it automatically extracts, cleans, and dates scanned photos with minimal human intervention.
Technical Pipeline & Architecture
1. Computer Vision Segmentation Pipeline (OpenCV)
The core extraction script (scanner_clipper.py) implements a comprehensive image processing pipeline to detect photographic boundaries on highly reflective white flatbed scanner backings:
- Noise Reduction: Applies a $5 \times 5$ Gaussian blur kernel (cv2.GaussianBlur) to suppress high-frequency noise from dust and the physical scanner bed.
- Dual-Path Segmentation:
  - Path A (Otsu Thresholding): Uses Otsu’s method (cv2.THRESH_OTSU + cv2.THRESH_BINARY_INV), which selects a single global threshold automatically from the image histogram, to segment darker prints from the bright scanner bed.
  - Path B (Canny Edge Detection): Runs a Canny edge detector (cv2.Canny) alongside Path A as a fallback, so that low-contrast prints or prints with light-colored borders are not missed.
- Morphological Cleanup: Builds rectangular structuring elements (cv2.getStructuringElement) and applies morphological closing followed by opening. This joins fractured boundaries, bridges gaps, and filters out isolated artifacts.
- Rotated Contour Analysis: Detects boundaries via external contour retrieval (cv2.findContours). Candidate contours are filtered through three geometric heuristics:
  - Area Ratio: Contours must occupy between 0.5% and 80% of the total scanned sheet area.
  - Aspect Ratio: Rejects overly thin contours (e.g. aspect ratios $> 5.0$).
  - Fill Factor: Computes the ratio of the contour area to the area of its minimum-area bounding rectangle; contours with a fill factor below $40\%$ are rejected as non-rectangular noise.
- Non-Maximum Overlap Suppression: Prunes nested or overlapping child contours by checking intersections between rotated minimum bounding rectangles.
graph TD
A[Raw Scanned Image] --> B[Gaussian Blur Kernel 5x5]
B --> C1[Otsu Thresholding]
B --> C2[Canny Edge Detection]
C1 --> D[Bitwise OR Combination]
C2 --> D
D --> E[Morphological Closing & Opening]
E --> F[Rotated Contour Extraction]
F --> G[Geometric & Overlap Filtering]
G --> H[Perspective Transform & Deskew]
H --> I[Whitespace Trimming & Shave]
I --> J[Final WebP / JPEG Output]
2. Perspective Transform & Edge Cleanup
Once valid photo boundaries are established, the pipeline extracts and rectifies the target prints:
- Deskewing & Rectification: Computes the rotated minimum-area bounding box (cv2.minAreaRect), orders the four vertices top-left to bottom-left using distance vectors, and derives the target width and height from the Euclidean norms of the box edges. It then maps the corners with cv2.getPerspectiveTransform and applies a bilinearly interpolated warp (cv2.warpPerspective) to extract a de-rotated rectangular image.
- Rounded Corner Shaving: Vintage prints often have physical rounded corners, which leave triangular white scanner artifacts in the corners of the rectangular crop. The utility applies a custom boundary “shaving” step that crops a configurable number of pixels inward from all four final borders to remove them.
3. Chronological EXIF Metadata Engine
Scanned photos completely lack original time metadata, breaking modern chronological search. The exif_date_editor.py tool restores this:
- Robust Date Parsing: Leverages regex patterns to extract dates in multiple formats (e.g. ISO YYYY-MM-DD, Year-Month YYYY-MM, Single Year YYYY, or written formats like August 1995).
- Sequential Incrementing: When applying an album date to a folder, the engine sequentially increments timestamps by a user-defined step (e.g., 60 seconds) to ensure that the file sort ordering remains exactly aligned with physical album layout order.
- Daylight Randomization: Optionally shifts starting timestamps to random daylight hours (between 9:00 AM and 5:00 PM) to make metadata timestamps look realistic.
- Low-Level Tag Writing: Interacts with binary image formats via Pillow’s EXIF bindings, modifying DateTime (base tag 306), DateTimeOriginal (EXIF IFD tag 36867), and DateTimeDigitized (EXIF IFD tag 36868) safely through a tempfile replace process to avoid file corruption.
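The parsing and incrementing behaviour described above could be sketched as follows, using only the standard library. The function names, the regex patterns, and the choice to default a missing month or day to 1 are assumptions for illustration and may differ from exif_date_editor.py. Timestamps are formatted as `YYYY:MM:DD HH:MM:SS`, the layout EXIF date tags use.

```python
import random
import re
from datetime import datetime, timedelta

MONTHS = {name: i for i, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}


def parse_album_date(text):
    """Return a datetime for the first recognised date in text, else None."""
    try:
        m = re.search(r"\b(\d{4})-(\d{2})-(\d{2})\b", text)   # YYYY-MM-DD
        if m:
            return datetime(int(m[1]), int(m[2]), int(m[3]))
        m = re.search(r"\b(\d{4})-(\d{2})\b", text)           # YYYY-MM
        if m:
            return datetime(int(m[1]), int(m[2]), 1)
        m = re.search(r"\b([A-Za-z]+)\s+(\d{4})\b", text)     # August 1995
        if m and m[1].lower() in MONTHS:
            return datetime(int(m[2]), MONTHS[m[1].lower()], 1)
        m = re.search(r"\b(\d{4})\b", text)                   # YYYY
        if m:
            return datetime(int(m[1]), 1, 1)
    except ValueError:
        pass  # e.g. month 13: treat as unparseable
    return None


def daylight_start(day, rng=random):
    """Move day to a random time between 09:00:00 and 16:59:59."""
    seconds = rng.randrange(9 * 3600, 17 * 3600)
    midnight = day.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(seconds=seconds)


def sequential_timestamps(start, count, step_seconds=60):
    """Return count EXIF-formatted timestamps spaced step_seconds apart."""
    return [
        (start + timedelta(seconds=i * step_seconds))
        .strftime("%Y:%m:%d %H:%M:%S")
        for i in range(count)
    ]
```

For a folder of sorted files, each filename would be paired with the timestamp at the same index, so sorting by capture time reproduces the album order.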
Desktop Dashboard UI
To make these terminal utilities usable without a command line, the project includes a custom-designed, dark-themed native desktop client built in pure Python Tkinter (gui.py):
- Multithreaded Execution: Uses background worker threads (threading.Thread) and thread-safe communication queues (queue.Queue) to execute CLI pipelines asynchronously, so the UI stays responsive during large batch operations.
- Real-Time Log Terminal: Implements a custom stdout/stderr redirector stream that prints CLI execution logs in real time to an integrated scrolled text terminal with color-coded tags (blue logs, green success markers, and red runtime errors).
- Interactive Visual Walkthrough Wizard: An interactive workflow that helps users process physical albums folder-by-folder:
  - Dynamically traverses local directory trees to locate directories containing images.
  - Generates live, grid-based image previews by reading and resizing images using OpenCV, converting them to RGB, and base64-encoding them into native Tkinter-friendly image frames.
  - Performs regular expression parsing on folder names to automatically guess the album’s year/month and pre-populate the date entry fields.
  - Provides standard Apply & Next, Skip, and End wizard controls.
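The worker-thread and queue pattern behind the responsive UI can be sketched without any GUI code, as below. The function names and the `("log" | "done" | "error", payload)` message shape are assumptions for this example, not the structure used in gui.py.

```python
import queue
import threading


def run_in_background(task, log_queue):
    """Run task(log) on a daemon thread, reporting through log_queue.

    task receives a log(line) callable; a final "done" or "error"
    message tells the consumer that the worker has finished.
    """
    def worker():
        try:
            task(lambda line: log_queue.put(("log", line)))
            log_queue.put(("done", None))
        except Exception as exc:  # surface failures to the UI thread
            log_queue.put(("error", str(exc)))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def drain(log_queue, handle):
    """Deliver all pending messages without blocking.

    Returns True once a "done" or "error" message has been seen.
    """
    finished = False
    while True:
        try:
            kind, payload = log_queue.get_nowait()
        except queue.Empty:
            return finished
        handle(kind, payload)
        if kind in ("done", "error"):
            finished = True
```

In a Tkinter application, `drain` would typically be called from the main thread on a timer, for example by a callback that reschedules itself with the widget `after` method until `drain` returns True. Only the main thread then touches widgets, which is the reason for passing messages through the queue rather than updating the log view from the worker.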
Continuous Integration & DevOps
The project is structured to transition effortlessly from development to deployment:
- Standalone Binary Compilations: Leverages PyInstaller to bundle the Python code, the underlying compiled OpenCV libraries, and dependency modules into single, self-contained native executable files (PhotoUtils) for Linux, macOS, and Windows.
- GitHub Actions CI/CD Release Pipeline: Runs a production CI configuration (.github/workflows/release.yml) built on a multi-operating system matrix (ubuntu-latest, windows-latest, macos-latest). Whenever a tag is pushed, it automatically provisions runners, builds the binaries, wraps macOS apps in zip bundles, and publishes them as downloadable assets to a new GitHub Release.
- Virtual Environments: Fully integrated with uv for lightning-fast virtual environment management and repeatable lockfile-based dependency loading.
Key Achievements
- Classical CV Excellence: Replaced tedious manual cropping tools with high-speed automated contour segmentation and perspective warping, handling hundreds of scans in minutes.
- Responsive Architecture: Built a completely responsive Tkinter interface by separating CPU-heavy image calculations from the main GUI thread.
- Robust Metadata Restoration: Created a flexible sequential time-increment EXIF dating engine that resolves sorting limitations in cloud storage photo apps like Apple Photos and Google Photos.
- Cross-Platform Delivery: Designed an automated CI build workflow that packages complex headless Linux/OpenCV environments and compiled Windows assemblies into zero-install desktop executables.