
ISyE 7406 (Data Mining & Statistical Learning), HW#4: Local Smoothing in R

The goal of this homework is to help you better understand the statistical properties and computational challenges of local smoothing methods such as loess, Nadaraya-Watson (NW) kernel smoothing, and spline smoothing. For this purpose, we compute empirical biases and empirical variances based on m = 1000 Monte Carlo runs, where in each run we simulate a data set of n = 101 observations from the additive model

    Y_i = f(x_i) + ε_i,    (1)

with the famous Mexican hat function

    f(x) = (1 - x²) exp(-0.5x²),   -2π ≤ x ≤ 2π,    (2)

where ε_1, …, ε_n are independent and identically distributed (iid) N(0, 0.2²). This function is known to pose a variety of estimation challenges, and below we explore the difficulties inherent in estimating it.

(1) First consider the deterministic (fixed) design with equidistant points in [-2π, 2π].

(a) For each of the m = 1000 Monte Carlo runs, generate a data set of the form (x_i, Y_i), where x_i = 2π(-1 + 2(i - 1)/(n - 1)) and Y_i is drawn from model (1). Denote the data set from the j-th Monte Carlo run by D_j, for j = 1, …, m = 1000.

(b) For each data set D_j, compute the three local smoothing estimates at every point in D_j: loess (with span = 0.75), Nadaraya-Watson (NW) kernel smoothing with a Gaussian kernel and bandwidth = 0.2, and spline smoothing with the default tuning parameter.

(c) At each point x_i, for each local smoothing method, compute the empirical bias and empirical variance over the m = 1000 Monte Carlo runs:

    Bias{f̂(x_i)} = f̄(x_i) - f(x_i),  where f̄(x_i) = (1/m) Σ_{j=1}^m f̂^(j)(x_i),

    Var{f̂(x_i)} = (1/(m - 1)) Σ_{j=1}^m (f̂^(j)(x_i) - f̄(x_i))².

(d) Plot these quantities against x_i for all three local smoothing estimators: loess, NW kernel, and spline smoothing.

(e) Provide a thorough analysis of what the plots suggest, e.g., which method is better or worse in terms of bias, variance, and mean squared error (MSE)? Also consider whether this is a fair comparison between the three methods, and why or why not.

(2) Repeat part (1) with another deterministic design that has non-equidistant points. The following R code can be used to generate the design points x_i (you can keep the x_i fixed across the m = 1000 Monte Carlo runs):

set.seed(79)
x <- …
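For part (2), one illustrative way to produce a fixed non-equidistant design is to cluster the points through a Beta distribution; the generator below is an assumption chosen for illustration, not the assignment's exact code.

set.seed(79)
# Hypothetical non-equidistant design: 101 points in [-2*pi, 2*pi], denser near
# the center because Beta(2, 2) concentrates mass around 0.5 (illustrative only)
x_uneven <- sort(2 * pi * (2 * rbeta(101, 2, 2) - 1))

Such a design would simply replace x_seq in the script below and stay fixed across all m = 1000 runs.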

The Correct Answer and Explanation:


✅ R Code: Local Smoothing Bias and Variance Analysis (Equidistant Design)

set.seed(123)
# loess(), ksmooth(), and smooth.spline() all live in base R's stats package,
# so no additional libraries are needed.

# Parameters
m <- 1000
n <- 101
sigma <- 0.2
x_seq <- seq(-2 * pi, 2 * pi, length.out = n)
f_true <- function(x) (1 - x^2) * exp(-0.5 * x^2)
fx <- f_true(x_seq)

# Initialize storage: one row per Monte Carlo run, one column per design point
loess_est <- matrix(0, nrow = m, ncol = n)
nw_est <- matrix(0, nrow = m, ncol = n)
spline_est <- matrix(0, nrow = m, ncol = n)

# Monte Carlo simulations
for (j in 1:m) {
  epsilon <- rnorm(n, 0, sigma)
  y <- fx + epsilon
  
  # Loess
  loess_fit <- loess(y ~ x_seq, span = 0.75)
  loess_est[j, ] <- predict(loess_fit, x_seq)
  
  # NW kernel regression. Note that ksmooth() rescales its kernel so the
  # quartiles sit at +/- 0.25 * bandwidth, so bandwidth = 0.2 is very narrow here.
  nw_fit <- ksmooth(x_seq, y, kernel = "normal", bandwidth = 0.2, x.points = x_seq)
  nw_est[j, ] <- nw_fit$y
  
  # Spline smoothing with the default tuning parameter (penalty chosen by GCV)
  spline_fit <- smooth.spline(x_seq, y)
  spline_est[j, ] <- predict(spline_fit, x_seq)$y
}

# Empirical bias and variance at each design point, per the formulas above:
# the variance is the sample variance across the m runs (centered at the
# Monte Carlo mean, with the 1/(m - 1) divisor used by var())
bias <- function(est_matrix) colMeans(est_matrix) - fx
variance <- function(est_matrix) apply(est_matrix, 2, var)

bias_loess <- bias(loess_est)
bias_nw <- bias(nw_est)
bias_spline <- bias(spline_est)

var_loess <- variance(loess_est)
var_nw <- variance(nw_est)
var_spline <- variance(spline_est)

# Plotting
par(mfrow = c(3, 1), mar = c(4, 4, 2, 1))
plot(x_seq, bias_loess, type = "l", col = "blue", ylim = range(bias_loess, bias_nw, bias_spline), ylab = "Bias", main = "Empirical Bias")
lines(x_seq, bias_nw, col = "red")
lines(x_seq, bias_spline, col = "green")
legend("topright", legend = c("Loess", "NW", "Spline"), col = c("blue", "red", "green"), lty = 1)

plot(x_seq, var_loess, type = "l", col = "blue", ylim = range(var_loess, var_nw, var_spline), ylab = "Variance", main = "Empirical Variance")
lines(x_seq, var_nw, col = "red")
lines(x_seq, var_spline, col = "green")
legend("topright", legend = c("Loess", "NW", "Spline"), col = c("blue", "red", "green"), lty = 1)

# Pointwise MSE = squared bias + variance
mse_loess <- bias_loess^2 + var_loess
mse_nw <- bias_nw^2 + var_nw
mse_spline <- bias_spline^2 + var_spline
plot(x_seq, mse_loess, type = "l", col = "blue", ylim = range(mse_loess, mse_nw, mse_spline), ylab = "MSE", main = "Mean Squared Error")
lines(x_seq, mse_nw, col = "red")
lines(x_seq, mse_spline, col = "green")
legend("topright", legend = c("Loess", "NW", "Spline"), col = c("blue", "red", "green"), lty = 1)

Explanation

This experiment investigates the bias and variance of three local smoothing techniques (loess, Nadaraya-Watson kernel smoothing, and spline smoothing) through m = 1000 Monte Carlo runs on a fixed equidistant design.

Loess Smoothing: With span = 0.75, each local regression uses about 75% of the data, so the fit is quite smooth: its variance stays low and roughly constant across the interior, but the estimator underfits the sharp dip of the Mexican hat near x = 0, which is where its bias is largest, as the sketch below shows.
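To see how the span drives this tradeoff, here is a minimal sketch comparing two spans on one simulated data set; the narrow span of 0.15 is an assumption chosen for contrast, not part of the assignment.

par(mfrow = c(1, 1))
set.seed(1)
y_demo <- fx + rnorm(n, 0, sigma)                 # one simulated data set
fit_wide <- loess(y_demo ~ x_seq, span = 0.75)    # smooth: low variance, misses the dip
fit_narrow <- loess(y_demo ~ x_seq, span = 0.15)  # wiggly: tracks the dip, noisier
plot(x_seq, y_demo, pch = 20, col = "grey", xlab = "x", ylab = "y", main = "Loess span comparison")
lines(x_seq, fx, lwd = 2)
lines(x_seq, predict(fit_wide, x_seq), col = "blue")
lines(x_seq, predict(fit_narrow, x_seq), col = "red")
legend("topright", legend = c("true f", "span = 0.75", "span = 0.15"), col = c("black", "blue", "red"), lty = 1)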

NW Kernel Smoothing: The NW estimator with a Gaussian kernel and bandwidth = 0.2 is highly localized, which keeps bias low in high-curvature regions but inflates variance, since each estimate averages only a handful of nearby observations. Near the edges the kernel window becomes asymmetric and contains fewer points, so both bias and variance increase there (the boundary effect), as the quick check below illustrates.
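A quick numerical check of the boundary effect, continuing from the script above (the indices pick the two endpoints and the center of the design):

idx <- c(1, (n + 1) / 2, n)  # x = -2*pi, 0, 2*pi
data.frame(x = round(x_seq[idx], 3),
           bias = round(bias_nw[idx], 4),
           variance = round(var_nw[idx], 5))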

Spline Smoothing: Spline smoothing shows very low variance due to its globally penalized fit, but it tends to be biased in regions of high curvature or near inflection points. Because smooth.spline() selects a single smoothing parameter for the whole domain (by generalized cross-validation by default), the fit is less adaptive to localized features; the chosen parameter can be inspected as shown below.
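Continuing from the script above, the tuning that smooth.spline() settled on in the last run can be read off directly (these are standard components of a smooth.spline fit):

spline_fit$spar    # smoothing parameter on the spar scale
spline_fit$lambda  # corresponding penalty parameter
spline_fit$df      # effective degrees of freedom of the fit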

MSE (Mean Squared Error): Since the pointwise MSE equals squared bias plus variance, the loess and spline estimators typically beat NW overall: NW's narrow window earns it low bias at the sharp dip, but its high variance dominates the MSE elsewhere. Loess usually provides the best overall balance, while the spline, though somewhat biased at high-curvature points, remains stable across the domain; an aggregate comparison is sketched below.
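As a single-number summary, the pointwise MSE curves from the script above can be averaged over the design points:

# Average pointwise MSE per method (lower is better)
round(c(loess = mean(mse_loess), nw = mean(mse_nw), spline = mean(mse_spline)), 5)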

Fairness of Comparison: The comparison is broadly fair in that all three methods see the same fixed design and the same noise level. It is not entirely fair, however, because the tuning parameters are not comparable: loess's span is a fraction of the sample, NW's bandwidth is on the scale of x, and the spline's penalty is chosen automatically by cross-validation. None of these settings was optimized for this function, so the ranking partly reflects the choice of defaults rather than the methods themselves; one way to level the field is sketched below.
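A fairer study might match the methods' effective degrees of freedom before comparing bias and variance; a minimal sketch, assuming a single simulated data set:

set.seed(2)
y_demo <- fx + rnorm(n, 0, sigma)
loess(y_demo ~ x_seq, span = 0.75)$trace.hat  # approx. equivalent df of the loess fit
smooth.spline(x_seq, y_demo)$df               # df chosen by GCV for the spline
# smooth.spline(x_seq, y_demo, df = ...) would then fix the spline's df to match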

In conclusion, Loess emerges as the best all-around smoother for this setup, balancing bias and variance effectively.
