
Hybrid Images

A recreation of the classic optical illusion where a single image changes interpretation based on viewing distance, built from first principles.


Look at the image from very close, and then from very far. What do you see?

Hybrid images are static pictures that change interpretation based on viewing distance. Up close, your vision latches onto high frequencies like edges, fine detail, and texture. From far away, those details fade, leaving low frequencies like smooth shading and broad shapes to dominate perception. This project recreates the technique from Oliva, Torralba, and Schyns’ SIGGRAPH 2006 work on hybrid images by explicitly building the filtering pipeline that makes that perceptual swap possible.

Hybrid image result. Move closer to and farther from the screen to see the effect!

Project Details

This project recreates the SIGGRAPH 2006 hybrid image technique from first principles, without relying on pre-built filtering libraries. The goal was not just to produce a convincing visual trick, but to understand exactly how the trick is constructed at the pixel level.

The implementation is written in Python with NumPy, with the core filtering operations written by hand and treated as composable building blocks rather than a black box.

Overview

The pipeline treats an image as a blend of coarse structure and fine detail.

One image contributes the foundation: large shapes, smooth shading, and a silhouette that remains recognizable from afar. The other contributes the spark: sharp edges and texture that grab your attention once you are close enough to resolve them. The final hybrid image is the sum of those two carefully prepared components.

These are the two source images used to create the hybrid effect.

Original Left Image
Original Right Image

Implementation

The filtering stack was built incrementally so each function supported the next.

I started with 2D cross-correlation, implemented as the direct operation of sliding a kernel over an image and computing weighted sums. From there, I implemented 2D convolution by flipping the kernel and reusing the same machinery. Once those were stable, I wrote a 2D Gaussian blur kernel generator, which became the core primitive for building frequency-style filters in the spatial domain.
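As a rough sketch of that stack, the three primitives look something like the following. The function names, the zero-padding behavior, and the assumption of a 2D grayscale float array are illustrative choices here, not the project's exact code:

```python
import numpy as np

def cross_correlate_2d(image, kernel):
    """Slide the kernel over the image, taking a weighted sum at each pixel.

    Zero-pads the borders so the output matches the input shape.
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant")
    out = np.zeros(image.shape, dtype=np.float64)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def convolve_2d(image, kernel):
    """Convolution is cross-correlation with the kernel flipped on both axes."""
    return cross_correlate_2d(image, kernel[::-1, ::-1])

def gaussian_kernel(size, sigma):
    """Build a normalized size-by-size 2D Gaussian kernel."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()
```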

With Gaussian blur in place, the filters become conceptually clean:

  • A low-pass filter blurs the image and keeps the smooth version
  • A high-pass filter subtracts the blurred image from the original, leaving only fine detail

The hybrid is constructed by combining a low-pass version of one image with a high-pass version of the other, as the sketch below shows.
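In the simplest unweighted form, those definitions translate directly into code on top of the primitives above (the actual project blends with the mixing weights described under Results):

```python
def low_pass(image, size, sigma):
    """Keep coarse structure by blurring with a Gaussian kernel."""
    return convolve_2d(image, gaussian_kernel(size, sigma))

def high_pass(image, size, sigma):
    """Keep fine detail by subtracting the blurred image from the original."""
    return image - low_pass(image, size, sigma)

def make_hybrid(far_image, near_image, size, sigma):
    """Low frequencies of the far image plus high frequencies of the near one."""
    return low_pass(far_image, size, sigma) + high_pass(near_image, size, sigma)
```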

Low-Pass Filtered Image: blurred version that preserves coarse features
High-Pass Filtered Image: edge and detail information only

Results

The final hybrid depends heavily on tuning, because you are blending perceptual priorities rather than simply mixing pixels. The parameters I used were:

  • right_size: 8.0
  • left_size: 13.0
  • right_sigma: 4.1
  • left_sigma: 7.0
  • mixin_ratio: 0.65
  • scale_factor: 2.0

These settings control how much structure survives in the low-pass image, how sharp the extracted high-pass details feel, and how strongly one image dominates when the two compete. The mixin ratio became the balance knob, since it governs how assertively the high-frequency details sit on top of the low-frequency foundation; a sketch of how these parameters might plug into the pipeline follows.
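To make that concrete, here is one plausible wiring of the tuned parameters. The parameter names come from the project, but the blending formula, the variable names left_img and right_img, and the assumption that the left image supplies the low frequencies are mine:

```python
import numpy as np

# Hypothetical wiring; assumes float images normalized to [0, 1] and that
# the left image is the low-pass (far) layer, the right the high-pass (near).
low = low_pass(left_img, size=13, sigma=7.0)     # left_size, left_sigma
high = high_pass(right_img, size=8, sigma=4.1)   # right_size, right_sigma

mixin_ratio, scale_factor = 0.65, 2.0
hybrid = ((1.0 - mixin_ratio) * low + mixin_ratio * high) * scale_factor
hybrid = np.clip(hybrid, 0.0, 1.0)  # clamp back into displayable range
```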

The effect becomes more apparent when viewing the hybrid at different scales. Full size emphasizes the close-view identity, while smaller versions simulate stepping back and let the low-frequency image take over.
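A quick way to simulate stepping back is simply shrinking the hybrid. A crude sketch using NumPy striding; proper area resampling (e.g. via Pillow) would give smoother previews:

```python
half = hybrid[::2, ::2]     # roughly what you see from twice the distance
quarter = hybrid[::4, ::4]  # the low-pass identity should dominate here
```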

Full Size
Half Size
Quarter Size

Discussion

Implementing convolution and cross-correlation from scratch made filtering feel tangible. Frequency stopped being an abstract idea and became something I could control through sigma and kernel design. The project also reinforced that parameter tuning is perceptual as much as technical, since small changes in blur strength can flip which image the viewer believes.

Hybrid images are a small project with an outsized lesson: images contain layers of information at different scales, and perception decides which layer counts. Building the pipeline from scratch made that idea concrete and manipulable. The final result is a reminder that a single picture can hold multiple interpretations, and distance is what reveals them.

Technologies Used:

Python
NumPy
Computer Vision
Image Processing
Signal Processing