CS184/284A Final Report

Abstract

In this project we extended our HW3 path tracer with three key additions:

We implemented advanced BSDFs for mirror, glass, and microfacet (conductor), enabling sharp specular and realistic metallic looks.
We improved direct-light estimation with Multiple Importance Sampling (MIS): we added a BSDF-sampling branch and blended it with explicit light sampling via the two-sample power heuristic. We also included a delta-aware tweak for specular cases.
We added adaptive sampling with a ViT saliency prior: we precomputed a CLIP/ViT saliency heatmap and biased per-pixel budgets and convergence thresholds so that high-saliency regions (faces, silhouettes, highlights) got more samples and tighter confidence intervals, while low-saliency regions (flat walls, empty background) stopped earlier. This combined well with MIS and the advanced BSDFs: caustics and microfacet highlights retained quality, while large uniform areas converged quickly. We report speed and quality on two scenes: CBspheres_tex (glass + metal spheres in a Cornell box) and CBdragon_microfacet_au (gold dragon).

1. Technical Approach

BSDF Functions:
- Mirror BSDF (Delta Reflection)
  Goal. Ideal specular reflection with stable numerics and clean MIS interaction.
  
  Evaluation. Return non-zero only if $ \omega_i $ equals the perfect reflection of $ \omega_o $ (within tolerance $ \sim 10^{-3} $).
  
  Sampling. Deterministic: $ \omega_i = \mathrm{reflect}(\omega_o) $, pdf = 1.
  
  Key lines (pseudo):
  - wi = reflect(wo); // $ \omega_i = -\omega_o + 2\,(\omega_o\cdot n)\,n $
  - if (norm(wi - reflect(wo)) < tol) return rho_r;
  - Use EPS origin offset to avoid self-intersection.
  MIS note. Because the lobe is delta, light sampling almost never proposes the specular direction; in direct lighting, bias toward the BSDF branch (e.g., 0.8/0.2) for reduced fireflies.
- Glass BSDF (Specular R + T, Exact Fresnel)
  Goal. Ideal specular reflection + refraction with accurate dielectric Fresnel and energy-correct throughput.
  
  Fresnel (dielectric). Compute $ F = \tfrac{1}{2}(R_{\parallel}^2 + R_{\perp}^2) $ with $ \sin\theta_t = (\eta_i/\eta_t) \sin\theta_i $; on TIR set $ F=1 $.
  
  Sampling (mixture of deltas).
  - With prob. $ F $: reflection, $ \omega_i = \mathrm{reflect}(\omega_o) $, pdf = F, contribution $ \rho_r F $.
  - With prob. $ 1-F $: refraction, compute Snell’s $ \omega_t $ robustly, pdf = 1-F, contribution $ \rho_t (\eta_i/\eta_t)^2 (1-F) / |\cos\theta_t| $.
  Evaluation path. Only the exact R or T directions return non-zero (tolerance $ \sim 10^{-3} $).
  
  Key lines (pseudo):
  - double F = fresnel_dielectric(abs(wo.z), eta_i, eta_t);
  - if (coin_flip(F)) { wi = reflect(wo); pdf = F; return rho_r * F; }
  - if (!refract(wo, &wt, ior)) { wi = reflect(wo); pdf = 1; return rho_r; }
  - wi = wt; pdf = 1 - F; return rho_t * (eta*eta) * (1 - F) / abs_cos_theta(wi); // eta = eta_i / eta_t
  We also use EPS offsets, and in lighting estimators multiply by $ |\cos\theta| $ (not $ \max(0,\cdot) $) so refracted directions contribute correctly.
- Microfacet (Conductor) — Beckmann NDF, Exact $ (\eta,k) $ Fresnel
  Evaluation. For $ \omega_o\!\cdot n>0, \omega_i\!\cdot n>0 $, $$f = \frac{D(\mathbf{h})\,F(\omega_i\!\cdot\!\mathbf{h})\,G}{4\,\cos\theta_i\,\cos\theta_o},\quad D(\mathbf{h}) = \frac{e^{-\tan^2\theta_h/\alpha^2}}{\pi \alpha^2 \cos^4\theta_h}.$$ $ F $ uses exact conductor Fresnel with $ (\eta,k) $ per color channel; $ G $ uses a Smith-style clamp.
  
  Sampling (half-vector). Sample $ \mathbf{h} $ from Beckmann (invert CDF for $ \theta_h $, uniform $ \phi $), reflect $ \omega_o $ about $ \mathbf{h} $, and convert pdf via Jacobian $ 1/(4\,\omega_o\!\cdot\!\mathbf{h}) $.
  
  Key lines (pseudo):
  - sample h ~ Beckmann(alpha); wi = reflect(wo, h); if (wi.z <= 0) reject;
  - pdf(wi) = (D(h) * cos(theta_h)) / (4 * dot(wo, h));
  - return (D * F * G) / (4 * cos_i * cos_o);
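The half-vector sampling steps above can be sketched as follows. This is a minimal illustration rather than our renderer's code: the `Vec3` type and the names `beckmann_D`, `sample_beckmann_h`, `reflect_about`, and `half_vector_pdf_to_wi` are hypothetical, and all vectors are assumed to live in the local shading frame with the normal along +z.

```cpp
#include <algorithm>
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// Hypothetical minimal vector type in the local shading frame (normal = +z).
struct Vec3 { double x, y, z; };

static double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Beckmann NDF D(h) for roughness alpha; h is a unit half-vector with h.z > 0.
double beckmann_D(const Vec3& h, double alpha) {
    double cos2 = h.z * h.z;
    double tan2 = (1.0 - cos2) / cos2;
    return std::exp(-tan2 / (alpha * alpha)) / (kPi * alpha * alpha * cos2 * cos2);
}

// Draw h with density D(h) * cos(theta_h) by inverting the CDF of theta_h:
// tan^2(theta_h) = -alpha^2 * ln(1 - u1), phi = 2 * pi * u2, with u1, u2 in [0, 1).
Vec3 sample_beckmann_h(double u1, double u2, double alpha) {
    double tan2 = -alpha * alpha * std::log(1.0 - u1);
    double cos_t = 1.0 / std::sqrt(1.0 + tan2);
    double sin_t = std::sqrt(std::max(0.0, 1.0 - cos_t * cos_t));
    double phi = 2.0 * kPi * u2;
    return { sin_t * std::cos(phi), sin_t * std::sin(phi), cos_t };
}

// Reflect wo about the half-vector h: wi = 2 (wo . h) h - wo.
Vec3 reflect_about(const Vec3& wo, const Vec3& h) {
    double d = 2.0 * dot(wo, h);
    return { d * h.x - wo.x, d * h.y - wo.y, d * h.z - wo.z };
}

// Convert the half-vector density to a density over wi using the
// Jacobian 1 / (4 (wo . h)); returns 0 for back-facing half-vectors.
double half_vector_pdf_to_wi(const Vec3& wo, const Vec3& h, double alpha) {
    double wo_dot_h = dot(wo, h);
    if (wo_dot_h <= 0.0 || h.z <= 0.0) return 0.0;
    return beckmann_D(h, alpha) * h.z / (4.0 * wo_dot_h);
}
```

A caller would draw `h`, compute `wi = reflect_about(wo, h)`, reject the sample when `wi.z <= 0`, and divide the BSDF value by `half_vector_pdf_to_wi(wo, h, alpha)`.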
Multiple Importance Sampling:
Algorithm: Multiple Importance Sampling (MIS) for Direct Lighting

Goal. Combine explicit light sampling and BSDF sampling to estimate direct illumination with lower variance, including robust handling of delta lights and delta BSDFs.

Inputs. Hit point $\mathbf{x}$, shading frame $(o2w,w2o)$, outgoing direction $\omega_o$ (local), scene lights $\{\mathcal{L}\}$, material BSDF at $\mathbf{x}$.
Outputs. Direct-radiance estimate $L_{\text{dir}}(\mathbf{x},\omega_o)$.

Estimators
1. Light-sampling estimator. For each light $\mathcal{L}$, draw $n_\mathcal{L}$ samples: $$\omega_i \sim p_{\text{light}}(\cdot),\quad (\mathbf{x}\!\to\!\mathbf{y})\ \text{unoccluded}.$$ Per-sample contribution: $$\widehat L_{\text{light}}=\frac{f(\omega_o,\omega_i)\,L_i(\omega_i)\,|\cos\theta_i|}{p_{\text{light}}(\omega_i)}.$$ Use a shadow ray with min_t = EPS and max_t = distToLight - EPS; set $n_\mathcal{L}=1$ for delta lights, otherwise $n_\mathcal{L}=\texttt{ns\_area\_light}$ (optionally boosted for delta BSDF surfaces).
2. BSDF-sampling estimator. Draw one BSDF direction: $$\omega_i \sim p_{\text{bsdf}}(\cdot)=\text{pdf returned by }\texttt{bsdf.sample\_f},$$ trace a shadow ray; if it hits emissive geometry, accumulate: $$\widehat L_{\text{bsdf}}=\frac{f(\omega_o,\omega_i)\,L_i(\omega_i)\,|\cos\theta_i|}{p_{\text{bsdf}}(\omega_i)}.$$
Two-Estimator MIS Blend

Let the branch PDFs be $p_{\text{light}}$ and $p_{\text{bsdf}}$. We use the two-sample power heuristic ($\beta=2$):

$$w_{\text{light}}=\frac{p_{\text{light}}^2}{p_{\text{light}}^2+p_{\text{bsdf}}^2},\qquad w_{\text{bsdf}}=\frac{p_{\text{bsdf}}^2}{p_{\text{light}}^2+p_{\text{bsdf}}^2}.$$

The final estimator is: $$L_{\text{dir}} = w_{\text{light}}\ \widehat L_{\text{light}} + w_{\text{bsdf}}\ \widehat L_{\text{bsdf}}.$$ When light-sampling uses multiple samples per light we keep a running average and an average light PDF to stabilize weights. If one branch is unavailable (e.g., zero pdf or no hit), we return the other branch.

Delta-Aware Refinement
- Delta lights: Use $n_\mathcal{L}=1$ and rely on light sampling.
- Delta BSDFs (mirror/glass): Light sampling almost never proposes the specular direction. We bias the blend toward the BSDF branch with a convex mix when the surface is delta (e.g., weight_bsdf = 0.8, weight_light = 0.2), while keeping the power heuristic for non-delta materials: $$L_{\text{dir}}=\begin{cases} 0.2\,\widehat L_{\text{light}}+0.8\,\widehat L_{\text{bsdf}}, & \text{delta BSDF}\\[2pt] w_{\text{light}}\,\widehat L_{\text{light}} + w_{\text{bsdf}}\,\widehat L_{\text{bsdf}}, & \text{otherwise}. \end{cases}$$
Our Implementation Notes (What We Changed)
- We added BSDF sampling inside estimate_direct_lighting_importance to capture glossy/specular contributions that the light sampler misses.
- We defined branch PDFs for both BSDF and light sampling. For light sampling we computed an average light PDF over visible samples (normalized by the count) to stabilize weights. For delta BSDF surfaces we used the convex weights above to suppress fireflies.
- Finally, we implemented the power heuristic to combine branches based on their PDFs:
  double denom = pdf_bsdf*pdf_bsdf + direct_light_avg_pdf*direct_light_avg_pdf;
  return L_light * (direct_light_avg_pdf*direct_light_avg_pdf)/denom
  + L_bsdf * (pdf_bsdf*pdf_bsdf)/denom;
ML Guided Adaptive Sampling (ViT Saliency)
- Prior. A ViT/CLIP saliency heatmap $ s(x,y)\in[0,1] $ (resized to framebuffer) indicates perceptual importance.
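- Budget map. Per-pixel spp budget: $ n_{\text{budget}}(x,y) = \mathrm{clamp}(\text{base\_spp} \cdot [a + b\,s(x,y)],\, n_{\min},\, n_{\max}) $, where $ n_{\min} $ and $ n_{\max} $ are the global floor and cap.
- Convergence rule. The CI test uses a saliency-scaled tolerance: $ I \le \tau(s)\,\mu $, where $ I $ is the CI half-width, $ \mu $ is the pixel mean, and $ \tau(s) = \tau_{\text{lo}}(1-s)+\tau_{\text{hi}}s $. Here $ \tau_{\text{lo}} $ and $ \tau_{\text{hi}} $ are the tolerances at low and high saliency, with $ \tau_{\text{hi}} < \tau_{\text{lo}} $, so high saliency ⇒ tighter tolerance.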
- Safety rails. We keep a global hard floor on spp and stop early if CI is already tight (prevents oversampling empty regions even if the prior is wrong).
- We interpret maxTolerance = 0.05 as a relative 5% CI half-width.
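The budget map and saliency-scaled tolerance above might look like the sketch below. The `SaliencyParams` struct and both function names are hypothetical, and `tau_lo` / `tau_hi` are taken to be the tolerances at saliency 0 and 1 respectively, so `tau_hi` should be the smaller value.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical parameter bundle for the saliency prior.
struct SaliencyParams {
    int base_spp;    // nominal samples per pixel
    double a, b;     // budget scale is a + b * s
    int n_min;       // global floor on the per-pixel budget
    int n_max;       // global cap on the per-pixel budget
    double tau_lo;   // relative tolerance at saliency 0
    double tau_hi;   // relative tolerance at saliency 1 (tighter)
};

// Per-pixel sample budget: clamp(base_spp * (a + b * s), n_min, n_max).
int pixel_budget(double s, const SaliencyParams& p) {
    s = std::min(1.0, std::max(0.0, s));
    int n = static_cast<int>(std::lround(p.base_spp * (p.a + p.b * s)));
    return std::min(p.n_max, std::max(p.n_min, n));
}

// Per-pixel relative tolerance: tau(s) = tau_lo * (1 - s) + tau_hi * s.
double pixel_tolerance(double s, const SaliencyParams& p) {
    s = std::min(1.0, std::max(0.0, s));
    return p.tau_lo * (1.0 - s) + p.tau_hi * s;
}
```

For example, with the illustrative values `base_spp = 1024`, `a = 0.25`, `b = 0.75`, a background pixel with `s = 0` gets a quarter of the nominal budget while a fully salient pixel gets all of it.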

2. Challenges

Challenge 1: Debugging the Glass BSDF — From Black Patches to Correct Transmission

Issue. The initial glass implementation produced black spots, inverted shadows, and reduced transparency.

Cause. In several lighting estimators, the geometric term was computed as max(0, z), which incorrectly zeroed valid refracted paths. For transmission, the correct form is $|\cos\theta|$, allowing directions across hemispheres.

Fix. Replaced all std::max(0.0, ...z) with fabs(...z) in:

- Uniform-hemisphere direct lighting
- Light-sampling branch (NEE)
- BSDF-sampling branch for direct hits
- Recursive bounce in global illumination

Why it works. The rendering equation’s geometry term uses $|\cos\theta|$, not a clamped cosine. Clamping removes valid transmission contributions, causing energy loss and artifacts.

Additional stability measures. Exact dielectric Fresnel with TIR fallback, $(\eta_i / \eta_t)^2$ energy scaling, delta-lobe tolerance $10^{-3}$, EPS-offset shadow rays, and delta-aware MIS weighting.
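For reference, a dielectric Fresnel term with the TIR fallback can be sketched as below. It follows the unpolarized formula from the Glass BSDF section; the argument order mirrors the `fresnel_dielectric(abs(wo.z), eta_i, eta_t)` call in our pseudocode, but the body is an illustrative reconstruction rather than a copy of our source.

```cpp
#include <algorithm>
#include <cmath>

// Unpolarized Fresnel reflectance at a dielectric interface.
// cos_i: |cos| of the incident angle; eta_i / eta_t: indices of refraction
// on the incident and transmitted sides. Returns 1 on total internal reflection.
double fresnel_dielectric(double cos_i, double eta_i, double eta_t) {
    cos_i = std::min(1.0, std::max(0.0, cos_i));
    double sin_i = std::sqrt(std::max(0.0, 1.0 - cos_i * cos_i));
    double sin_t = (eta_i / eta_t) * sin_i;   // Snell's law
    if (sin_t >= 1.0) return 1.0;             // total internal reflection
    double cos_t = std::sqrt(std::max(0.0, 1.0 - sin_t * sin_t));
    double r_par  = (eta_t * cos_i - eta_i * cos_t) / (eta_t * cos_i + eta_i * cos_t);
    double r_perp = (eta_i * cos_i - eta_t * cos_t) / (eta_i * cos_i + eta_t * cos_t);
    return 0.5 * (r_par * r_par + r_perp * r_perp);
}
```

At normal incidence from air into glass with index 1.5 this gives the familiar 4% reflectance, which is a quick sanity check when debugging.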

Result. The black spots and ghosting disappeared, transparency was restored, and no new fireflies appeared.

Challenge 2: Handling Delta Distributions in MIS to Reduce Fireflies

One challenge was managing delta-distribution BSDFs (such as perfect specular reflection and refraction) during Multiple Importance Sampling (MIS). In the initial implementation, assigning equal weights to light sampling and BSDF sampling sometimes caused severe fireflies, especially on reflective glass surfaces. To address this, we adjusted the MIS weight calculation when isect.bsdf->is_delta() is true, biasing the combination towards BSDF sampling (the snippet is listed at the end of this section).

This change increases the BSDF sampling weight to 0.8 and reduces light sampling to 0.2. It significantly reduced noise and improved highlight stability on mirror and glass surfaces without introducing noticeable bias in the final rendering.

Challenge 3: Using an Imperfect Saliency Map Without Hurting Adaptive Sampling

Our ViT-based saliency is imperfect: it can miss specular effects (caustics, glossy lobes), over-mark flat backgrounds, or shift under different FOVs. The challenge was to extract useful guidance from this noisy prior while still letting adaptive sampling (variance-driven) do its job. A naïve approach (e.g., fully trusting the mask) either oversamples empty regions or starves hard transport like caustics.

What we tried & where we landed. During development, we experimented with extra checks for caustics and low-variance background suppression, but found they introduced too much processing overhead to justify the gains. In the end, we kept a streamlined version where the saliency mask biases the initial per-pixel sample allocation, and the adaptive sampler’s variance-based stopping criterion takes over naturally. It relies on three safeguards:

- Floor & Ceiling: Per-pixel SPP is clamped to [min_spp, max_spp] so the prior can’t zero out hard pixels or blow up easy ones.
- Annealing: Start more prior-biased in the first few batches, then gradually hand control back to variance so the sampler can “correct” a wrong prior.
- CI Gating: If a pixel’s confidence interval is already tight, we stop sampling regardless of the prior’s value.

Policy sketch (pseudo):


            // Inputs: saliency s in [0,1], variance stats (mu, sigma), CI half-width I,
            //         global caps: min_spp, max_spp
            int budget_from_prior = lerp(min_spp, max_spp, s);
            
            // Anneal prior influence over time
            budget_from_prior = mix(budget_from_prior, ns_aa, anneal_factor);
            
            // CI gating: stop early if converged
            while (n < budget_from_prior) {
                take_batch();
                if (n >= 2 && I <= tol * mu) break; // variance wins
            }
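A runnable version of the CI-gated loop might look like the sketch below. It assumes a 95% normal interval, $I = 1.96\,\sigma/\sqrt{n}$, computed from running sums of a scalar per-sample value such as illuminance; `PixelStats` and `sample_pixel` are hypothetical names, and the annealing step is left out for brevity.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Running statistics for one pixel, built from sums of x and x^2.
struct PixelStats {
    double s1 = 0.0;  // sum of sample values
    double s2 = 0.0;  // sum of squared sample values
    int n = 0;        // number of samples taken

    void add(double x) { s1 += x; s2 += x * x; ++n; }
    double mean() const { return n > 0 ? s1 / n : 0.0; }

    // 95% CI half-width I = 1.96 * sigma / sqrt(n); infinite until n >= 2.
    double ci_half_width() const {
        if (n < 2) return std::numeric_limits<double>::infinity();
        double var = std::max(0.0, (s2 - s1 * s1 / n) / (n - 1));
        return 1.96 * std::sqrt(var / n);
    }

    // Relative convergence test: I <= tol * mu.
    bool converged(double tol) const {
        return n >= 2 && ci_half_width() <= tol * mean();
    }
};

// Take samples in batches until the prior-derived budget is exhausted or the
// CI test passes, whichever comes first. Returns the number of samples used.
template <class SampleFn>
int sample_pixel(SampleFn&& next_sample, int budget, int batch,
                 double tol, PixelStats& st) {
    while (st.n < budget) {
        for (int i = 0; i < batch && st.n < budget; ++i) st.add(next_sample());
        if (st.converged(tol)) break;  // variance wins over the prior
    }
    return st.n;
}
```

With this structure a perfectly flat pixel stops after its first batch even if the saliency prior assigned it a large budget, while a noisy pixel runs until the budget is spent.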

Outcome. This simpler integration preserved most of the performance benefit without slowing down the render loop. In our tests, it consistently reduced wasted samples on low-detail areas while maintaining image quality for challenging regions like glass and metal.


Delta-aware MIS weighting referenced in Challenge 2:

// For delta-distribution materials, adjust MIS weights to improve reflection results
if (isect.bsdf->is_delta()) {
    // Increase BSDF sampling weight, reduce light sampling to mitigate fireflies
    double weight_bsdf = 0.8;
    double weight_light = 0.2;
    return L_out_lighting_sample * weight_light + L_out_bsdf_sample * weight_bsdf;
}

3. Results

Rendering of microfacet surface CBdragon with no Microfacet BSDF implemented.	Rendering of CBspheres with no advanced BSDFs implemented.
CBdragon_microfacet_au with Microfacet BSDF implemented without MIS adaptive sampling.	CBdragon_microfacet_au with Microfacet BSDF implemented and with MIS adaptive sampling.
CBspheres with Glass BSDF implemented without MIS adaptive sampling.	CBspheres with Glass BSDF implemented and with MIS adaptive sampling.
CBspheres_microfacet_al_ag with Microfacet BSDF implemented without MIS adaptive sampling.	CBspheres_microfacet_al_ag with Microfacet BSDF implemented and with MIS adaptive sampling.

Saliency-Biased Adaptive Sampling (SBAS) — 4096 spp Results

We compare baseline adaptive sampling (with MIS) against our ViT Saliency-Biased Adaptive Sampling (SBAS). For each scene we show: the rendered image, per-pixel sampling rate heatmap (_rate), and the saliency map & mask used to guide sampling.

CBdragon — Gold Microfacet Dragon in a Cornell Box (4096 spp)

We expect SBAS to concentrate samples on high-frequency geometry (spines, face) and bright specular regions, while de-emphasizing the flat background.

Baseline (adaptive + MIS)	Baseline sampling rate visualization
SBAS (ViT-guided) render	SBAS sampling rate visualization
Saliency map (continuous)	Saliency mask (thresholded)

Timing & quality

Baseline time (4096 spp): 3693.3846 s
SBAS time (4096 spp): 2197.6018 s + 40 s = 2237.6018 s (1.65× speed-up; the baseline takes 165.06% of the SBAS time)
Notes: SBAS allocates more samples to the head/spines and bright specular ridges; background receives fewer.

CBspheres — Glass & Metal Spheres in Cornell Box (4096 spp)

SBAS emphasizes glass caustics and metal highlights while reducing effort on walls/ceiling. Useful for testing specular transport + MIS.

Baseline (adaptive + MIS)	Baseline sampling rate visualization
SBAS (ViT-guided) render	SBAS sampling rate visualization
Saliency map	Saliency mask

Timing & quality

Baseline time (4096 spp): 1122.1510 s
SBAS time (4096 spp): 443.9791 s + 40 s = 483.9791 s (2.32× speed-up; the baseline takes 231.86% of the SBAS time)
Notes: SBAS increases samples under the spheres (caustics) and on specular highlights; reduces on planar walls.

4. References

Tools & Frameworks:

C++ for path tracer extensions
PyTorch / OpenCV for saliency maps
HTML/CSS for webpage

5. Team-member Contributions (alphabetical)

Eduardo Cortes: Led development of the ML-guided adaptive sampling implementation / write-up.
Henry Michaelson: Initial (buggy) implementation of advanced BSDFs / MIS; structure and templates.
Yuhe Qin: Fixed BSDF implementations for glass; added assets; debugging.
Zhehao Yang: Fixed mirror & microfacet BSDFs and MIS bugs.

Final Report: MIS & ML-guided Adaptive Sampling

AI Acknowledgement

Project Video