import numpy as np
from collections import Counter

def euclidean_distance(a, b):
    return np.sqrt(np.sum((a - b) ** 2))

def knn_predict(X_train, y_train, x_query, k):
    # Compute distances from the query point to all training points
    distances = [euclidean_distance(x_query, x_train) for x_train in X_train]
    # Sort distances and get the indices of the k nearest neighbors
    k_indices = np.argsort(distances)[:k]
    # Extract the labels of the k nearest neighbors
    k_nearest_labels = [y_train[i] for i in k_indices]
    # Return the most common label among the k nearest neighbors
    return Counter(k_nearest_labels).most_common(1)[0][0]
Introduction
Pitch tunneling is the art of deception. It’s a pitcher’s ability to make multiple, distinct pitches look identical as they leave the hand. From the batter’s perspective, two pitches can travel through the same “tunnel” partway to the plate before diverging. What looked like a hittable fastball suddenly drops off the table - a devastating curveball for a swinging strike. Effective tunneling neutralizes a batter’s ability to recognize a pitch, forcing a split-second, and often wrong, decision.
Here’s a clip by @PitchingNinja (Rob Friedman) to illustrate pitch tunneling: Tyler Glasnow, 99 mph two-seamer and 84 mph curveball overlay (pic.twitter.com/7tq1KcoqKj, July 19, 2025).
Notice how both pitches erupt from the exact same arm slot and follow the same initial trajectory, making it difficult for the batter to distinguish between the two until it’s too late.
In this article, we are going to quantify how well a pitcher tunnels certain pitches. This project is inspired by Baseball Savant’s page on arm angle. While resources like Baseball Savant provide ample data on release points and arm angles, a dedicated metric for tunneling effectiveness is surprisingly absent. This project aims to fill that gap. In order to quantify pitch tunneling, we will use K-Nearest Neighbors (K-NN) to find similar pitches in terms of release point and movement. We can then see what percentage of those similar pitches are of the same pitch type. A high percentage indicates good tunneling ability, while a low percentage suggests that the pitcher is telegraphing their pitches.
K-Nearest Neighbor
We turn to K-Nearest Neighbor (K-NN), a non-parametric machine learning algorithm used for classification and regression tasks (Wikipedia contributors (2025)). Other methods like Logistic Regression or Support Vector Machines usually have a training phase that involves stochastic gradient descent to find optimal parameters. K-NN’s training phase is much simpler: it stores the feature vectors and class labels in memory - that’s it. So the training phase takes essentially no time to complete. The cost shows up at inference, where every query must be compared against every stored point, which is why K-NN works best with smaller amounts of data. In general, K-NN works well on low-dimensional and low-volume data.
Professor Fei-Fei Li over at Stanford has a good lecture on when NOT to use K-NN.
The bulk of K-NN happens in the inference phase. Given some positive integer k, K-NN assigns a label to a query x based on the most frequent label among the k training samples nearest to x. K-NN commonly uses Euclidean distance to measure nearness, but other distance metrics like Manhattan distance or Minkowski distance can also be used. Here’s a rundown of the three distance metrics for points x, y \in \mathbb{R}^n (“K-Nearest Neighbor(KNN) Algorithm - GeeksforGeeks” (n.d.)).
\text{Euclidean Distance}(x, y) = \sqrt{\sum_{j=1}^{n} (x_j - y_j)^2}
\text{Manhattan Distance}(x, y) = \sum_{j=1}^{n} |x_j - y_j|
\text{Minkowski Distance}(x, y) = \left( \sum_{j=1}^{n} |x_j - y_j|^p \right)^{1/p}
Note that when p=2, Minkowski distance is equivalent to Euclidean distance, and when p=1, it is equivalent to Manhattan distance.
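As a quick sanity check, all three metrics can be expressed with a single NumPy helper; the function name and sample points below are just for illustration.

import numpy as np

def minkowski_distance(x, y, p=2):
    # p=2 recovers Euclidean distance, p=1 recovers Manhattan distance
    return np.sum(np.abs(x - y) ** p) ** (1 / p)

a, b = np.array([0.0, 0.0]), np.array([3.0, 4.0])
print(minkowski_distance(a, b, p=2))  # 5.0 (Euclidean)
print(minkowski_distance(a, b, p=1))  # 7.0 (Manhattan)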
Since there is no training phase, we just need to implement the inference phase. This is done using a brute-force approach. For each query point, we will compute the distance to all training points and select the k nearest neighbors.
Formalizing K-NN
Suppose we have a training dataset of size n, where each data point X has a corresponding label Y. That is, (X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n) take values in \mathbb{R}^d \times \{1,2\}, where d is the number of features and \{1,2\} denotes a binary label. X | Y = r \sim P_r for r \in \{1,2\}. In other words, a data point X given its label Y = r follows some distribution P_r. Given a query point x and some distance metric, we reorder the training data based on distance to x:
\| X_{(1)} - x \| \leq \| X_{(2)} - x \| \leq \ldots \leq \| X_{(n)} - x \| where X_{(i)} is the i-th nearest neighbor to x. Hence, (X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n) is reordered as (X_{(1)}, Y_{(1)}), (X_{(2)}, Y_{(2)}), \ldots, (X_{(n)}, Y_{(n)}): (X_{(1)}, Y_{(1)}) is the closest neighbor to x, (X_{(2)}, Y_{(2)}) is the second closest, and so on.
This setup allows us to define the K-NN classifier, which assigns a label to x based on the most frequent label among its k nearest neighbors (written out as a decision rule after the steps below).
- Choose a value for k (a positive integer).
- Look at the k nearest neighbors of x: (X_{(1)}, Y_{(1)}), (X_{(2)}, Y_{(2)}), \ldots, (X_{(k)}, Y_{(k)}).
- Assign to x the label that is most common among these k neighbors.
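Putting the three steps together, the classifier is simply a majority vote over the k reordered labels:
\hat{y}(x) = \underset{r \in \{1,2\}}{\arg\max} \sum_{i=1}^{k} \mathbb{1}\{ Y_{(i)} = r \}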
Implementation
K-NN can easily be implemented using NumPy and built-in Python libraries.
Let’s visualize K-NN with a simple 2D dataset.
Code
# import blobs
from sklearn.datasets import make_blobs
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

sns.set_theme(style="whitegrid", palette="pastel")

# Create a synthetic dataset
X, y = make_blobs(n_samples=100, centers=2, random_state=47, center_box=(-1.5, 1.5))

# Define k values to test
k_values = [2, 10]
plot_data = []

# Apply K-NN for each k
for k in k_values:
    predictions = [knn_predict(X, y, x_query, k) for x_query in X]
    # Store results for plotting
    for i in range(len(X)):
        plot_data.append({'feature1': X[i, 0],
                          'feature2': X[i, 1],
                          'prediction': predictions[i],
                          'k': k})

# Convert to DataFrame
df = pd.DataFrame(plot_data)

# Create a facet grid
g = sns.FacetGrid(df, col="k", hue="prediction", height=3)
g.map(sns.scatterplot, "feature1", "feature2", edgecolor='k')
g.add_legend()
g.set_axis_labels("Feature 1", "Feature 2")
g.set_titles("k = {col_name}")
plt.show()

Observe how the decision boundary changes with different values of k. A smaller k (e.g., k=2) leads to a more complex decision boundary that closely follows the training data, while a larger k (e.g., k=10) results in a smoother boundary that generalizes better.
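The plot above colors the training points by their predicted labels; to draw the boundary itself, one option is to classify a dense grid of points with the same knn_predict function. The helper below is a rough sketch (plot_decision_boundary is a hypothetical name, and the grid resolution and colormap are arbitrary choices):

import numpy as np
import matplotlib.pyplot as plt

def plot_decision_boundary(X, y, k, resolution=0.05):
    # Build a grid covering the training data with a small margin
    x_min, x_max = X[:, 0].min() - 0.5, X[:, 0].max() + 0.5
    y_min, y_max = X[:, 1].min() - 0.5, X[:, 1].max() + 0.5
    xx, yy = np.meshgrid(np.arange(x_min, x_max, resolution),
                         np.arange(y_min, y_max, resolution))
    grid = np.c_[xx.ravel(), yy.ravel()]
    # Classify every grid point with the brute-force K-NN from above
    zz = np.array([knn_predict(X, y, point, k) for point in grid]).reshape(xx.shape)
    plt.contourf(xx, yy, zz, alpha=0.3, cmap="Pastel1")
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolor="k", cmap="Pastel1")
    plt.title(f"K-NN decision regions (k={k})")
    plt.xlabel("Feature 1")
    plt.ylabel("Feature 2")
    plt.show()

# For example, with the blobs generated earlier:
plot_decision_boundary(X, y, k=10)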
Enough math. Let’s ball.
Pitch Tunneling with K-NN
For the first time in this blog, we will use PyBaseball to fetch data. PyBaseball is a Python package that provides an easy way to access baseball data from various sources, including MLB’s Statcast and Baseball Savant. We will first examine pitcher Joe Ryan. According to Baseball Savant, Ryan currently has the highest fastball run value in 2025.
Let’s first query the data for Joe Ryan.
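In code, that query might look something like this with PyBaseball; the 2025 date range is an assumption about the window analyzed, and the variable names are illustrative:

from pybaseball import playerid_lookup, statcast_pitcher

# Look up Joe Ryan's MLBAM id (if several players match, pick the correct row)
ryan_id = playerid_lookup("ryan", "joe")["key_mlbam"].iloc[0]

# Pull his pitch-level Statcast data for the assumed 2025 window
data = statcast_pitcher("2025-03-01", "2025-10-01", ryan_id)
print(data[["pitch_type", "release_pos_x", "release_pos_z", "release_extension"]].head())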
Now that we have the data, let’s preprocess it to extract the relevant features and labels. We will focus on the release_pos_x, release_pos_z, and release_extension columns, as well as the pitch_type column. In addition, we will normalize the data to ensure that the features are on the same scale.
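A minimal version of that preprocessing might look like the following, assuming the data frame returned by the query above and using scikit-learn's StandardScaler for the normalization (the exact scaling used in the app may differ):

from sklearn.preprocessing import StandardScaler

features = ["release_pos_x", "release_pos_z", "release_extension"]

# Keep only the columns we need and drop pitches with missing values
pitches = data[features + ["pitch_type"]].dropna()

# Normalize the features so they are on the same scale
scaler = StandardScaler()
X = scaler.fit_transform(pitches[features])
y = pitches["pitch_type"].to_numpy()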
Magic time. The key step for using K-NN in this application is not to classify the pitches, but rather to find the nearest neighbors based on the features. We will use the NearestNeighbors class from scikit-learn to find the k nearest neighbors for each pitch. Below is a Shiny app that allows you to analyze a pitcher’s tunneling ability using K-NN. By entering the pitcher’s first and last name, my app’s backend will fetch the pitcher’s data and compute the K-NN scores for the selected pitch type. The app will then display a report and a plot of the pitch types and their tunneling ability.
You can view the full app here.
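For readers curious about the mechanics behind the app, here is a rough sketch of a neighbor-based deception score using scikit-learn's NearestNeighbors. The choice of k, the 'FF' example pitch type, and the score definition (share of neighbors that are a different pitch type) are my assumptions for illustration, not necessarily what the app computes:

import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_deception_score(X, y, target_pitch, k=10):
    # Fit on every pitch, then query neighbors for the target pitch type only
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, indices = nn.kneighbors(X[y == target_pitch])
    # Drop each pitch itself (always its own nearest neighbor), keep the k true neighbors
    neighbor_labels = y[indices[:, 1:]]
    # Fraction of neighbors that are a *different* pitch type
    return np.mean(neighbor_labels != target_pitch)

print(f"Four-seam deception score: {knn_deception_score(X, y, 'FF'):.2%}")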
The main limitation of using K-NN for pitch tunneling is that it implicitly assumes pitch types are evenly distributed. If a pitcher throws a pitch type very infrequently, the K-NN scores will show a higher percentage of different pitch types, simply because there are fewer pitches of that type in the dataset for the algorithm to find. This is why I set a minimum threshold of 25 pitches for the target pitch type: if a pitcher has fewer than 25 pitches of a certain type, the K-NN scores will not be reliable. Having said that, there is still merit in comparing the K-NN scores for a player’s main pitch.
To address the issue of uneven pitch distribution, we also provide a Log-Likelihood method to analyze pitch tunneling. This method calculates the likelihood of a pitch being of a certain type given its features, and it can be more robust to uneven distributions. We leverage the scipy.stats library to compute the log-likelihood of the pitch type given the features and compare it against the most frequently used other pitch (in other words, the secondary pitch). The app allows you to switch between the K-NN and Log-Likelihood methods for analysis. The closer the log-likelihood score is to zero, the better the tunneling ability of the pitcher.
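Here is one way that comparison could be sketched with scipy.stats, fitting a multivariate normal to the primary and secondary pitch types and averaging the log-likelihood ratio over the primary pitches; this is my interpretation for illustration, and the app's actual calculation may differ:

import numpy as np
from scipy.stats import multivariate_normal

def log_likelihood_score(X, y, primary, secondary):
    # Fit a multivariate normal to each pitch type's scaled release features
    X_p, X_s = X[y == primary], X[y == secondary]
    dist_p = multivariate_normal(X_p.mean(axis=0), np.cov(X_p, rowvar=False))
    dist_s = multivariate_normal(X_s.mean(axis=0), np.cov(X_s, rowvar=False))
    # Average log-likelihood ratio on the primary pitches:
    # values closer to zero mean the secondary pitch is hard to tell apart at release
    return np.mean(dist_s.logpdf(X_p) - dist_p.logpdf(X_p))

# The secondary pitch is taken to be the most frequent non-primary pitch type
primary = "FF"
secondary = pitches.loc[pitches["pitch_type"] != primary, "pitch_type"].value_counts().idxmax()
print(f"L-Score ({primary} vs {secondary}): {log_likelihood_score(X, y, primary, secondary):.2f}")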
The Payoff: Results
Since we listed the top 10 pitchers by fastball run value, let’s analyze the tunneling ability of those pitchers. We refer to the K-NN Deception Score as “K-Score” and the log-likelihood score as “L-Score”.
Pitcher | K-Score | L-Score |
---|---|---|
1. Joe Ryan | 40.86% | -0.32 |
2. Nick Pivetta | 41.58% | -0.34 |
3. Bryan Woo | 48.71% | -0.23 |
4. Hunter Brown | 49.68% | -0.45 |
5. Ryne Nelson | 24.36% | -0.84 |
6. Andrew Abbott | 40.66% | -0.46 |
7. Kevin Gausman | 21.14% | -1.96 |
8. Paul Skenes | 52.06% | -0.37 |
9. Jacob deGrom | 42.84% | -0.38 |
10. Cade Smith | 18.81% | -1.24 |
Average | 38.07% | -0.66 |
Sure, it is a small sample size, but the top 10 are fairly consistent, with an average K-Score of about 38% and most L-Scores clustered between -0.2 and -0.5. The K-Score indicates that roughly 40% of a pitch’s nearest neighbors are of a different type, while the L-Scores close to zero suggest good tunneling ability.
Let’s see the bottom 10 pitchers by fastball run value.
Pitcher | K-Score | L-Score |
---|---|---|
1. Germán Márquez | 61.52% | -1.62 |
2. Antonio Senzatela | 34.33% | -1.04 |
3. Grant Holmes | 62.81% | -0.11 |
4. Angel Chivilli | 46.16% | -0.35 |
5. Walker Buehler | 65.59% | -0.13 |
6. Chase Dollander | 36.09% | -1.85 |
7. Bradley Blalock | 31.08% | -3.37 |
8. Austin Gomber | 53.94% | -0.77 |
9. Osvaldo Bido | 55.24% | -1.12 |
10. Charlie Morton | 56.79% | -0.72 |
Average | 50.36% | -1.11 |
Observe that this is where accounting for uneven pitch distribution is important. The average K-Score is around 50%, higher than that of the top 10 pitchers. However, the average L-Score of around -1.11 is further from zero, suggesting poorer tunneling ability; it is sizably lower than the top 10 pitchers’ average L-Score of around -0.66.
Let’s also take a look at the top 10 pitchers by slider run value in 2025.
Pitcher | K-Score | L-Score |
---|---|---|
1. Chris Sale | 23.48% | -1.76 |
2. Andrés Muñoz | 32.83% | -0.63 |
3. Steven Okert | 39.92% | -0.21 |
4. Carlos Rodón | 24.77% | -7.70 |
5. Jacob deGrom | 47.87% | -0.43 |
6. Dylan Lee | 30.26% | -1.20 |
7. Ryan Helsley | 48.46% | -0.36 |
8. Grant Holmes | 62.64% | -0.14 |
9. Bennett Sousa | 34.65% | -0.72 |
10. Josh Hader | 21.88% | -3.61 |
Average | 36.68% | -1.68 |
Now let’s see the bottom 10 pitchers by slider run value in 2025.
Pitcher | K-Score | L-Score |
---|---|---|
1. Scott Blewett | 45.01% | -0.98 |
2. Ryan Pepiot | 68.63% | -2.06 |
3. Brandon Eisert | 56.90% | -0.08 |
4. Sean Burke | 53.49% | -0.98 |
5. Bailey Falter | 56.74% | -2.37 |
6. Mitch Spence | 38.11% | -1.66 |
7. Antonio Senzatela | 60.22% | -0.86 |
8. Justin Verlander | 68.12% | -0.71 |
9. Patrick Corbin | 52.00% | -1.58 |
10. Chad Green | 46.17% | -0.23 |
Average | 54.54% | -1.15 |
Among sliders, we see a different trend. Comparing the top 10 pitchers by slider run value to the bottom 10, we see that the K-Score is lower for the top 10 (36.68%) than for the bottom 10 (54.54%). However, the L-Score is also lower for the top 10 (-1.68) than for the bottom 10 (-1.15). This somewhat contradicts a hypothesis I had that the top pitchers would have a higher K-Score and a lower L-Score. There is something to be said about pitch movement and how “unhittable” a pitch is. If we really wanted to quantify pitch tunneling, we would need to consider the movement of the pitches as well - but that’s a topic for another day.
Conclusion
Pitch tunneling has always felt more like art than science, a ‘you know it when you see it’ kind of skill. The goal here was to drag it out of the shadows of intuition and into the bright, unforgiving light of data. We started with two statistical tools, K-Nearest Neighbors and Log-Likelihood, not as perfect solutions, but as instruments to see what could be seen.
From these methods, we developed the K-Score and L-Score, our first-draft attempt at a true deception metric. They represent a new way to look at the game’s oldest duel: pitcher versus batter. But a new number on a spreadsheet is just that—a number. The real work begins now. These scores need to be tested against more seasons, more pitchers, and more outcomes. We need to see if the pitchers who score well are, in fact, the same ones who consistently make the best hitters look foolish.
This is less of an endpoint and more of a promising lead. The search is on for a reliable, quantifiable measure of a pitcher’s craft. If the K-Score and L-Score—or some future, more refined version of them—can prove their mettle, they could offer a new edge in evaluating talent, a new tool for pitchers honing their art, and for the rest of us, a new window into the subtle genius of keeping a hitter guessing.
Past articles:
- Logistic Regression
- Support Vector Machine
- K-Means Clustering

Github:
- Running on Numbers