Basic Baseball Simulation

Jack Overby
8 min read · Nov 16, 2020


Are we in a simulation? Perhaps. If you don’t believe me, well, I’m the one who is simulating your reality. I specifically programmed you to be skeptical of this premise, therefore, your skepticism is actually proof that I’m right. Ha! Got you! Don’t try to fight it. I already know what’s going to happen. Embrace the determinism!

Okay… moving on to the article itself, I recall that one day, three or so years ago, I decided to try to whip up a baseball simulator. I was inspired by Win Expectancy Finder, the wonderful site created by Greg Stoll, which holds a database of every single regular season MLB at bat since 1957 (currently 130,860 games!) and uses them to calculate the probability of a team winning in a current scenario, based on how often teams have won, historically, from said scenario. Here are a few examples:

Start of a game: Top of the 1st, no outs, no one on: Home Team wins 53.87% of the time

Bottom of the 9th, 2 outs, 2 men on, home team down 1: Away team wins 83.53% of the time (don’t tell that to the TB Rays, though!)

The main flaw with this method is that it doesn’t take individual matchups into account. It gives a good average, but if the best team in the league is hosting the worst team in the league, or vice versa, or if a team is starting its best/worst pitcher, the actual probabilities will look quite different! Thus, it would be nice to be able to plug in specific player probabilities (pitchers & batters), to get a more accurate picture of how lineups would match up against each other.

To start off, we’re going to take the overall MLB average and use those baselines for all players. Here is the breakdown for 2019, the last full season:

Results of 2019 Plate Appearances (186,518)

  • Singles: 25,947 (13.91%)
  • Doubles: 8,531 (4.57%)
  • Triples: 785 (0.42%)
  • Home Runs: 6,776 (3.63%)
  • Walks: 17,879 (9.59%)
  • Outs: 126,600 (67.88%)

We will use this probability distribution for every player. We’ll treat all outs as strikeouts (i.e. no tagging up or double plays) and assume all runners advance 2 bases on both singles and doubles.

First, we’ll initialize a series of variables to keep track of the game state: inning, outs, score (home & away team run totals), bases (e.g. empty, loaded). This could and should be done via classes, but for the sake of this exercise, we can get the job done via simple functions and global variables:

let score = [0, 0];
let inning = 1;
let outs = 0;
let bases = [0,0,0];
let gameLog = []; // collection of game results from a given sim
let awayOrHome = 0; // 0: away team up; 1: home team up
let gameOn = true; // true: game still happening; false: game over

For those of you who don’t know the rules of baseball, here you go. For our purposes, we want the inning variable to increment at the end of each inning and reset to 1 at the end of the game. We want outs to increment at each out and reset to 0 at the end of each half-inning. We want bases to be updated and then reset as needed. So, we’ll design some functions to handle each at-bat.

Here is a function that randomly generates a number and assigns it a baseball outcome. Note: the random-looking integers are the cumulative sum of the aforementioned batting results.

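function atBat() {
  // Scale a random number to the total number of 2019 plate appearances.
  const outcome = Math.random() * 186518;
  // Each threshold is the running total of the outcome counts listed above.
  if (outcome < 126600) return 'Out';
  else if (outcome < 144479) return 'BB';
  else if (outcome < 170426) return '1B';
  else if (outcome < 178957) return '2B';
  else if (outcome < 179742) return '3B';
  else return 'HR';
}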
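
The thresholds in atBat() need not be typed by hand. As a minimal sketch (the names OUTCOME_COUNTS, buildThresholds, and pickOutcome are hypothetical, not part of the original code), they can be derived from the 2019 counts above as a running total. The random roll is passed in as a parameter so the mapping is easy to check:

```javascript
// 2019 plate-appearance counts, in the same order atBat() checks them.
const OUTCOME_COUNTS = [
  ['Out', 126600],
  ['BB', 17879],
  ['1B', 25947],
  ['2B', 8531],
  ['3B', 785],
  ['HR', 6776],
];

// Turn the counts into cumulative upper bounds: 126600, 144479, 170426, ...
function buildThresholds(counts) {
  let runningTotal = 0;
  return counts.map(([label, count]) => {
    runningTotal += count;
    return [label, runningTotal];
  });
}

const THRESHOLDS = buildThresholds(OUTCOME_COUNTS);
const TOTAL_PA = THRESHOLDS[THRESHOLDS.length - 1][1]; // 186518

// Map a roll in [0, 1) to an outcome label. The roll defaults to Math.random().
function pickOutcome(roll = Math.random()) {
  const scaled = roll * TOTAL_PA;
  const match = THRESHOLDS.find(([, upperBound]) => scaled < upperBound);
  return match[0];
}

console.log(THRESHOLDS.map(([, upperBound]) => upperBound));
// upper bounds: 126600, 144479, 170426, 178957, 179742, 186518
```

Deriving the bounds this way means a typo in one count shows up as a wrong total, rather than hiding in a hand-computed threshold.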

Here’s a helper function to sum the elements of an array:

function arraySum(array) {
  return array.reduce((sum, element) => sum + element, 0);
}

Here is a function that handles walks:

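function walker(bases) {
  bases.push(0); // temporary 4th slot representing home plate
  bases[0]++; // batter takes first base
  for (let i = 0; i < 3; i++) {
    // Two runners on the same base: push one forward
    if (bases[i] === 2) {
      bases[i]--;
      bases[i + 1]++;
    }
  }
  score[awayOrHome] += bases.pop(); // a runner forced home scores
}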

This mutates the bases according to baseball walk rules and adds any runs scored to the batting team’s total; a team scores on a walk only if the bases are loaded (i.e. if all bases are occupied by players prior to the walk).
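
To see the forced-advance logic in action, here is a minimal sketch of the same algorithm written as a pure function. The name walkResult and the { bases, runs } return shape are assumptions for illustration; the article’s walker() instead mutates the global bases and score:

```javascript
// Same algorithm as walker(), but returns a new state instead of mutating globals.
// bases is [first, second, third], with 1 = occupied and 0 = empty.
function walkResult(bases) {
  const next = [...bases, 0]; // 4th slot temporarily represents home plate
  next[0]++;                  // batter takes first base
  for (let i = 0; i < 3; i++) {
    if (next[i] === 2) {      // two runners on one base: push one forward
      next[i]--;
      next[i + 1]++;
    }
  }
  const runs = next.pop();    // anyone pushed to home plate scores
  return { bases: next, runs };
}

console.log(walkResult([0, 0, 0])); // bases [1, 0, 0], runs 0
console.log(walkResult([0, 1, 0])); // bases [1, 1, 0], runs 0 (runner on second is not forced)
console.log(walkResult([1, 0, 1])); // bases [1, 1, 1], runs 0 (runner on third is not forced)
console.log(walkResult([1, 1, 1])); // bases [1, 1, 1], runs 1 (bases loaded)
```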

Here is another function, to handle one of the hits (single, double, triple, home run):

function hitter(hitType) {
  // Prepend the batter's position; existing runners shift toward home
  if (hitType === '1B') bases = [1, 0].concat(bases);
  else if (hitType === '2B') bases = [0, 1].concat(bases);
  else if (hitType === '3B') bases = [0, 0, 1].concat(bases);
  else if (hitType === 'HR') bases = [0, 0, 0, 1].concat(bases);
  // Anything past index 2 (third base) has crossed home plate
  const runs = arraySum(bases.slice(3, bases.length));
  bases = bases.slice(0, 3);
  score[awayOrHome] += runs;
}
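
A few worked cases make the array-shifting trick easier to follow. The sketch below restates the same logic as a pure function; hitResult, RUNNER_ADVANCE, and BATTER_BASE are hypothetical names introduced here, not part of the original code:

```javascript
// How many bases each hit type moves the existing runners in this model.
// Note that singles and doubles both advance runners two bases.
const RUNNER_ADVANCE = { '1B': 2, '2B': 2, '3B': 3, 'HR': 4 };
// Index the batter ends up at (0 = first base, 3 = home plate).
const BATTER_BASE = { '1B': 0, '2B': 1, '3B': 2, 'HR': 3 };

function hitResult(bases, hitType) {
  // Prepend empty slots, which shifts the existing runners forward.
  const shifted = Array(RUNNER_ADVANCE[hitType]).fill(0).concat(bases);
  shifted[BATTER_BASE[hitType]] = 1;
  // Everything past third base (index 2) has crossed home plate.
  const runs = shifted.slice(3).reduce((sum, runner) => sum + runner, 0);
  return { bases: shifted.slice(0, 3), runs };
}

console.log(hitResult([1, 0, 0], '1B')); // bases [1, 0, 1], runs 0
console.log(hitResult([0, 1, 0], '1B')); // bases [1, 0, 0], runs 1
console.log(hitResult([1, 0, 0], '2B')); // bases [0, 1, 1], runs 0
console.log(hitResult([1, 1, 1], 'HR')); // bases [0, 0, 0], runs 4
```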

Next, here is a function to handle all at-bats, regardless of the outcome:

function atBatResult() {
  const myAtBat = atBat();
  if (myAtBat === 'Out') outs++;
  else if (myAtBat === 'BB') walker(bases);
  else hitter(myAtBat);
  // Game over: the home team leads in the 9th or later (after the top half or on a walk-off),
  // or the away team leads once the bottom half ends
  if ((inning >= 9 && ((outs >= 3 && awayOrHome === 0) || awayOrHome === 1) && score[0] < score[1]) ||
      (inning >= 9 && outs >= 3 && awayOrHome === 1 && score[0] > score[1])) {
    gameOn = false;
  }
  // Half-inning over: switch sides and clear the bases
  if (outs >= 3) {
    if (awayOrHome === 1) {
      inning++;
    }
    outs = 0;
    awayOrHome = (awayOrHome + 1) % 2;
    bases = [0, 0, 0];
  }
}
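
The end-of-game condition is the densest part of atBatResult(). As a sketch, it can be pulled out into a named predicate that takes the state as an argument; isGameOver is a hypothetical helper, intended to be logically equivalent to the condition above rather than a replacement the article uses:

```javascript
// Returns true when the game should end, given the state right after a plate appearance.
// awayOrHome: 0 = away team batting, 1 = home team batting.
function isGameOver({ inning, outs, awayOrHome, score }) {
  if (inning < 9) return false;
  const [away, home] = score;
  const halfInningOver = outs >= 3;
  // Home team leads after the top half ends, or takes the lead while batting (walk-off).
  const homeWins = home > away && (awayOrHome === 1 || halfInningOver);
  // Away team leads once the bottom half ends.
  const awayWins = away > home && awayOrHome === 1 && halfInningOver;
  return homeWins || awayWins;
}

// Walk-off: home team goes ahead in the bottom of the 9th with 1 out.
console.log(isGameOver({ inning: 9, outs: 1, awayOrHome: 1, score: [2, 3] })); // true
// Tied after nine full innings: play continues into extra innings.
console.log(isGameOver({ inning: 9, outs: 3, awayOrHome: 1, score: [2, 2] })); // false
```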

Finally, here is a function that runs atBatResult() in a loop until the game is over, i.e. until gameOn equals false. Once the game is over, it pushes the result (0: away team wins, 1: home team wins) to the gameLog array and resets the game state.

function playGame() {
  while (gameOn) {
    atBatResult();
  }
  let result;
  if (score[0] > score[1]) {
    console.log(`Away team wins ${score[0]}-${score[1]}!`);
    result = 0;
  }
  else {
    console.log(`Home team wins ${score[1]}-${score[0]}!`);
    result = 1;
  }
  // Reset the global game state for the next game
  inning = 1;
  outs = 0;
  awayOrHome = 0;
  bases = [0, 0, 0];
  score = [0, 0];
  gameOn = true;
  return result;
}

Let’s go through and make sure the sim is outputting reasonable results:

[...Array(5).keys()].forEach(i => gameLog.push(playGame()));
console.log(gameLog);

Which outputs the following:

Away team wins 2-0!
Away team wins 9-2!
Away team wins 3-1!
Home team wins 10-4!
Home team wins 8-4!
[0, 0, 0, 1, 1]

Hooray! These are reasonable, baseball-looking scores, and the results in the gameLog correspond to what we see onscreen. That is, the home team is 2–3.

As a final step, let’s modify our function so that we can input the starting game state that we want to simulate, e.g. bottom of the 9th, down 1, bases loaded, 2 outs:

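function setGameState(i = 1, o = 0, side = 0, b = [0, 0, 0], s = [0, 0]) {
  // Parameter names differ from the globals so that they do not shadow them
  [inning, outs, awayOrHome] = [i, o, side];
  bases = [...b];
  score = [...s];
  gameOn = true;
  return playGame();
}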

This function will set the game at the specified state (with default values corresponding to the start of a game) and play it out until a result is reached.

As it turns out, the best way to implement this simulator is to avoid all the messy global variables and do it with OOP. That’s right, with classes, states, instances, inheritance and all that good stuff! After many painful hours of tinkering, and in some cases simply blowing everything up and starting from scratch, here is what I got:

class Game {
  constructor(params) {
    this.inning = 1;
    this.outs = 0;
    this.awayOrHome = 0;
    this.bases = [0, 0, 0];
    this.score = [0, 0];
    this.gameOn = true;
    this.setGameState(params);
  }
  atBat() {
    const outcome = Math.random() * 186518;
    if (outcome < 126600) return 'Out';
    else if (outcome < 144479) return 'BB';
    else if (outcome < 170426) return '1B';
    else if (outcome < 178957) return '2B';
    else if (outcome < 179742) return '3B';
    else return 'HR';
  }
  arraySum(array) {
    return array.reduce((sum, element) => sum + element, 0);
  }
  walker() {
    this.bases.push(0);
    this.bases[0]++;
    for (let i = 0; i < 3; i++) {
      if (this.bases[i] === 2) {
        this.bases[i]--;
        this.bases[i + 1]++;
      }
    }
    this.score[this.awayOrHome] += this.bases.pop();
  }
  hitter(hitType) {
    if (hitType === '1B') this.bases = [1, 0].concat(this.bases);
    else if (hitType === '2B') this.setGameState({bases: [0, 1].concat(this.bases)});
    else if (hitType === '3B') this.setGameState({bases: [0, 0, 1].concat(this.bases)});
    else if (hitType === 'HR') this.setGameState({bases: [0, 0, 0, 1].concat(this.bases)});
    const runs = this.arraySum(this.bases.slice(3, this.bases.length));
    this.setGameState({
      bases: this.bases.slice(0, 3),
      score: this.score.map((elm, i) => i === this.awayOrHome ? elm + runs : elm)
    });
  }
  atBatResult() {
    const myAtBat = this.atBat();
    if (myAtBat === 'Out') this.setGameState({outs: this.outs + 1});
    else if (myAtBat === 'BB') this.walker();
    else this.hitter(myAtBat);
    if ((this.inning >= 9 && ((this.outs >= 3 && this.awayOrHome === 0) || this.awayOrHome === 1) && this.score[0] < this.score[1]) ||
        (this.inning >= 9 && this.outs >= 3 && this.awayOrHome === 1 && this.score[0] > this.score[1])) {
      this.setGameState({gameOn: false});
    }
    if (this.outs >= 3) {
      if (this.awayOrHome === 1) {
        this.setGameState({inning: this.inning + 1});
      }
      this.setGameState({
        outs: 0,
        awayOrHome: (this.awayOrHome + 1) % 2,
        bases: [0, 0, 0]
      });
    }
  }
  playGame() {
    while (this.gameOn) {
      this.atBatResult();
    }
    const result = this.score[0] > this.score[1] ? 0 : 1;
    this.setGameState({
      inning: 1,
      outs: 0,
      awayOrHome: 0,
      bases: [0, 0, 0],
      score: [0, 0],
      gameOn: true
    });
    return result;
  }
  setGameState(params) {
    Object.assign(this, params);
  }
}

class Simulator {
  constructor(startState) {
    this.setBaseState({
      inning: 1,
      outs: 0,
      awayOrHome: 0,
      bases: [0, 0, 0],
      score: [0, 0],
      gameOn: true
    });
    this.setBaseState(startState);
  }
  setBaseState(newState) {
    Object.assign(this, newState);
  }
  simulate(its = 100) {
    let wins = 0;
    for (let i = 0; i < its; i++) {
      // Deep-copy the starting state so each game gets fresh arrays
      const baseState = JSON.parse(JSON.stringify(this));
      let game = new Game(baseState);
      wins += game.playGame();
    }
    console.log(`The home team won ${wins} out of ${its}, for a winning percentage of ${wins / its * 100}%!`);
  }
}

To test this out, let’s do a few sims, with 1000 iterations. We’d expect about 50%, given that the simulator treats both teams equally and gives no home-field advantage:

let sim = new Simulator();
for (let i=0;i<5;i++){
sim.simulate(1000);
}
// Result
The home team won 493 out of 1000, for a winning percentage of 49.3%!
The home team won 506 out of 1000, for a winning percentage of 50.6%!
The home team won 492 out of 1000, for a winning percentage of 49.2%!
The home team won 493 out of 1000, for a winning percentage of 49.3%!
The home team won 509 out of 1000, for a winning percentage of 50.9%!

Pretty good! All the results are hovering right around 50%, well within the margin of error on both sides.
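
To put a number on “margin of error”: each simulated game is an independent trial with two outcomes, so the home-win count follows a binomial distribution. A rough sketch, using the normal approximation and a hypothetical helper named marginOfError:

```javascript
// Approximate 95% margin of error for a win proportion p estimated from n games.
// Uses the normal approximation to the binomial: 1.96 * sqrt(p * (1 - p) / n).
function marginOfError(p, n, z = 1.96) {
  return z * Math.sqrt((p * (1 - p)) / n);
}

const margin = marginOfError(0.5, 1000);
console.log(`±${(margin * 100).toFixed(1)} percentage points`); // ±3.1 percentage points
```

So a 1000-game run between evenly matched teams should usually land between roughly 46.9% and 53.1%, and all five results above do.
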
Now, let’s try the second scenario listed above (bottom of the 9th, 2 outs, 2 men on, home team down 1) and see if our results are close to 16.47% for the home team (i.e. 100% minus 83.53%):

sim.setBaseState({inning: 9, outs: 2, awayOrHome: 1, bases: [1,1,0], score: [1,0]});
for(let i=0;i<5;i++){
sim.simulate(1000);
}
// Result
The home team won 185 out of 1000, for a winning percentage of 18.5%!
The home team won 188 out of 1000, for a winning percentage of 18.8%!
The home team won 185 out of 1000, for a winning percentage of 18.5%!
The home team won 181 out of 1000, for a winning percentage of 18.1%!
The home team won 193 out of 1000, for a winning percentage of 19.3%!

Hey, not too bad! It makes sense that these sim results would be a little higher, since teams typically put their closers in at the end of the game, who are more efficient on a pitch-by-pitch basis. “Hey, if that’s the case, why don’t teams just use their closers the whole game?” Here’s why.
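
Is a gap of about two percentage points just noise? Here is a quick check, sketched under the assumption that the five runs above can be pooled into one sample of 5,000 games (zScore is a hypothetical helper name):

```javascript
// Home wins from the five 1000-game runs reported above.
const homeWins = [185, 188, 185, 181, 193];
const games = homeWins.length * 1000;
const simRate = homeWins.reduce((sum, w) => sum + w, 0) / games; // 932 / 5000 = 0.1864

// Historical home-win rate for this scenario, per Win Expectancy Finder.
const historicalRate = 0.1647;

// Number of standard errors between the observed and expected proportions.
function zScore(observed, expected, n) {
  const standardError = Math.sqrt((expected * (1 - expected)) / n);
  return (observed - expected) / standardError;
}

console.log(zScore(simRate, historicalRate, games).toFixed(1)); // about 4.1
```

A gap of roughly four standard errors is too large to blame on sampling alone, which is consistent with a systematic difference between the model and real games, such as the closer effect described above.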

Conclusion

I hope you enjoyed the demonstration! My main takeaways are the following:

  1. Avoid global variables; use classes and OOP principles whenever possible.
  2. Seemingly simple things, like baseball rules, can be quite tricky to convert from intuitive understanding into code.
  3. It’s a lot easier to spend hours and hours on a problem when you enjoy the subject matter!

Until next weekend!
