ISyE 7406: Data Mining & Statistical Learning HW#4: Local Smoothing in R

The goal of this homework is to help you better understand the statistical properties and computational challenges of local smoothing methods such as loess, Nadaraya-Watson (NW) kernel smoothing, and spline smoothing. For this purpose, we compute empirical biases and empirical variances based on m = 1000 Monte Carlo runs, where in each run we simulate a data set of n = 101 observations from the additive model

    Yi = f(xi) + εi,    (1)

with the famous Mexican hat function

    f(x) = (1 − x²) exp(−0.5x²),  −2π ≤ x ≤ 2π,    (2)

where ε1, ..., εn are independent and identically distributed (iid) N(0, 0.2²). This function is known to pose a variety of estimation challenges, and below we explore the difficulties inherent in estimating it.

(1) Let us first consider the (deterministic, fixed) design with equidistant points in [−2π, 2π].

(a) For each of m = 1000 Monte Carlo runs, generate a data set of the form (xi, Yi), where xi = 2π(−1 + 2(i − 1)/(n − 1)) and Yi is drawn from the model in (1). Denote the data set from the j-th Monte Carlo run by Dj, for j = 1, ..., m = 1000.

(b) For each data set Dj (i.e., each Monte Carlo run), compute the three kinds of local smoothing estimates at every point in Dj: loess (with span = 0.75), Nadaraya-Watson (NW) kernel smoothing with a Gaussian kernel and bandwidth = 0.2, and spline smoothing with the default tuning parameter.

(c) At each point xi, for each local smoothing method, compute the empirical bias Bias{f̂(xi)} and the empirical variance Var{f̂(xi)} over the m = 1000 Monte Carlo runs:

    Bias{f̂(xi)} = (1/m) Σ_{j=1}^{m} f̂^{(j)}(xi) − f(xi),
    Var{f̂(xi)} = (1/(m − 1)) Σ_{j=1}^{m} ( f̂^{(j)}(xi) − f̄(xi) )²,

where f̂^{(j)}(xi) is the estimate at xi from the j-th run and f̄(xi) = (1/m) Σ_{j=1}^{m} f̂^{(j)}(xi).

(d) Plot these quantities against xi for all three local smoothing estimators: loess, NW kernel, and spline smoothing.

(e) Provide a thorough analysis of what the plots suggest, e.g., which method is better or worse in terms of bias, variance, and mean squared error (MSE)? Do you think this is a fair comparison between the three methods? Why or why not?

(2) Repeat part (1) with another (deterministic) design that has non-equidistant points. The following R code can be used to generate the design points xi (you can keep the xi fixed across the m = 1000 Monte Carlo runs):

set.seed(79)
x
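
The snippet above is cut off after set.seed(79) in the source. Purely as an illustrative stand-in (an assumption, not the assignment's actual generator), a non-equidistant design that concentrates points near the center of [−2π, 2π] could be produced as follows:

set.seed(79)
# Hypothetical stand-in for the truncated design generator:
# sorted Beta(2, 2) draws rescaled from (0, 1) to (-2*pi, 2*pi),
# giving denser spacing near 0 than near the boundaries
x <- 2*pi*sort(2*rbeta(101, 2, 2) - 1)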

The Correct Answer and Explanation:

✅ R Code: Monte Carlo Simulation & Local Smoothing (Equidistant Design)

# Load required libraries: loess(), ksmooth(), and smooth.spline() all live
# in base R's 'stats' package, so no add-on packages are needed here
library(stats)

# Set parameters
set.seed(123)
m <- 1000
n <- 101
x <- seq(-2*pi, 2*pi, length.out = n)
f_true <- function(x) (1 - x^2) * exp(-0.5 * x^2)
fx <- f_true(x)
sigma <- 0.2

# Storage matrices
fhat_loess <- matrix(0, n, m)
fhat_nw <- matrix(0, n, m)
fhat_spline <- matrix(0, n, m)

# Monte Carlo runs
for (j in 1:m) {
  eps <- rnorm(n, 0, sigma)
  y <- fx + eps
  # Loess
  fit_loess <- loess(y ~ x, span = 0.75)
  fhat_loess[, j] <- predict(fit_loess, x)
  # NW kernel smoothing
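  # Note: ksmooth() rescales 'bandwidth' so the kernel quartiles fall at
  # +/- 0.25*bandwidth; with kernel = "normal" the effective Gaussian sd
  # is about 0.37 * bandwidth, not the bandwidth itself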
  fhat_nw[, j] <- ksmooth(x, y, kernel = "normal", bandwidth = 0.2, x.points = x)$y
  # Spline smoothing (default tuning parameter, chosen by GCV)
  fit_spline <- smooth.spline(x, y)
  fhat_spline[, j] <- predict(fit_spline, x)$y
}

# Compute bias and variance
bias_loess <- rowMeans(fhat_loess) - fx
var_loess <- apply(fhat_loess, 1, var)

bias_nw <- rowMeans(fhat_nw) - fx
var_nw <- apply(fhat_nw, 1, var)

bias_spline <- rowMeans(fhat_spline) - fx
var_spline <- apply(fhat_spline, 1, var)

# Plot
par(mfrow = c(2, 1))
plot(x, bias_loess, type = 'l', col = "blue", ylim = range(c(bias_loess, bias_nw, bias_spline)), ylab = "Bias", main = "Empirical Bias")
lines(x, bias_nw, col = "red")
lines(x, bias_spline, col = "green")
legend("topright", legend = c("Loess", "NW", "Spline"), col = c("blue", "red", "green"), lty = 1)

plot(x, var_loess, type = 'l', col = "blue", ylim = range(c(var_loess, var_nw, var_spline)), ylab = "Variance", main = "Empirical Variance")
lines(x, var_nw, col = "red")
lines(x, var_spline, col = "green")
legend("topright", legend = c("Loess", "NW", "Spline"), col = c("blue", "red", "green"), lty = 1)

Analysis:

The empirical bias and variance plots for loess, NW kernel, and spline smoothing provide a comprehensive view of the statistical behavior of each smoother under the additive noise model. All three methods aim to estimate the Mexican hat function, which is nontrivial due to its multiple inflection points and tails.
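
For reference, the target function itself is easy to visualize (a quick base-graphics sketch):

# The Mexican hat function over the design interval
curve((1 - x^2) * exp(-0.5 * x^2), from = -2*pi, to = 2*pi, n = 501,
      xlab = "x", ylab = "f(x)", main = "Mexican hat function")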

Bias:
Spline smoothing generally exhibits the lowest bias across the range, particularly near the peaks and troughs of the function. Loess shows moderate bias, while NW kernel smoothing tends to have higher bias, especially near the boundaries, where its window becomes one-sided, and in regions where the function changes rapidly: with a fixed bandwidth, the kernel cannot adapt to local curvature.

Variance:
NW kernel smoothing demonstrates the lowest variance of the three in this simulation, as its fixed-window averaging trades flexibility for stability. Loess has higher variance, particularly where fewer neighboring points influence the local fit. Spline smoothing, while achieving low bias, suffers from higher variance in the tails, where its flexibility borders on overfitting.

Conclusion:
Spline smoothing performs best on bias but at the cost of higher variance, making it suitable when bias is the bigger concern. NW kernel smoothing is stable but less accurate in regions of high curvature. Loess strikes a balance between the two, making it favorable in moderately complex settings.

Fairness of Comparison:
The comparison uses fixed or default tuning parameters (span = 0.75 for loess, bandwidth = 0.2 for the NW kernel, and the default penalty for the smoothing spline), which may favor some methods over others. A fairer comparison would select each method's bandwidth or smoothing parameter by cross-validation. Still, the analysis is instructive for understanding default behavior and the bias-variance tradeoff.
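
As a sketch of what such tuning could look like on a single simulated data set (x, y) from the loop above; the span grid and the plug-in bandwidth selector dpill() from the KernSmooth package are illustrative choices, not prescribed by the assignment:

library(KernSmooth)  # provides dpill() and locpoly()

# Spline: leave-one-out cross-validation instead of the GCV default
fit_spline_cv <- smooth.spline(x, y, cv = TRUE)

# Kernel: plug-in bandwidth for a local-linear Gaussian-kernel fit
h_plugin <- dpill(x, y)
fit_kernel <- locpoly(x, y, bandwidth = h_plugin, degree = 1)

# Loess: leave-one-out CV over a grid of candidate spans
loocv_loess <- function(span, x, y) {
  errs <- sapply(seq_along(x), function(i) {
    fit <- loess(y[-i] ~ x[-i], span = span,
                 control = loess.control(surface = "direct"))
    y[i] - predict(fit, x[i])
  })
  mean(errs^2, na.rm = TRUE)
}
spans <- seq(0.2, 0.9, by = 0.05)
best_span <- spans[which.min(sapply(spans, loocv_loess, x = x, y = y))]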
