I am Jake Vance. I have always wanted to write a book, but I hate reading, so why would I make something I wouldn’t ever want to look at? I made this website instead. I would like to start by thanking Dr. Jayasekare, because I wouldn’t have been able to make this website without the knowledge I gained in her classes. My main goal with this site is to change the way people play and think about the game of baseball. I am incredibly competitive, and I want to find new ways to give my team an advantage without stepping over the line that the 2017 Astros or the 1919 White Sox crossed. I want to do this by using probability, creativity, and machine learning. I graduated with honors from Butler University in 2020, where I majored in both Actuarial Science and Statistics and minored in Business Administration, Data Science, and Mathematics. I would like to go back to school and get a master’s degree in Data Analytics or Data Science because, after learning such a wide variety of topics during my time at Butler, I know there is still an endless amount of information out there that I haven’t discovered. It would have to come at a good price, though, because you can figure out almost anything you want if you put time and effort into it; a master’s degree just makes the information easier to find and forces you to put the time in.

In the summer of 2019, I had a job where I would travel to professional soccer games and record the stats for each game. I’m not the biggest soccer fan in the world, but being able to watch very skilled players for free and getting paid to pay attention while sitting at midfield was pretty awesome. I definitely got some weird looks for being on my phone the entire game, and there was one scary instance when the team’s mascot singled me out and started dancing on me because I was staring at my phone to keep stats. There are, no doubt, people who have similar jobs for baseball and other sports like hockey and basketball, but what I want is a job analyzing the data they collect, especially for an MLB team. MLB analytics teams generally have around 10 people, so across 30 teams there are around 300 positions, and I want to be 1 of those 300. Knowing that I’m not going to be the head of a department right away lowers my odds even further. Also, considering there’s no reason to hire right now, since the 2020 season is only 60 games and some teams have already bailed on it, I have to leave this as just a dream for now. I started my data analytics career as an intern for my university’s baseball team and am now working as a data analyst for an insurance company. I’m not going to say I started the analytics team for Butler Baseball, because I wouldn’t have known about it if it weren’t for my adviser, Dr. Holmes, and I wouldn’t have been selected without Professor Knoderer, both of whom are reasons why someone should go to Butler. There also wouldn’t even be a position if the coaches weren’t open-minded about the future of technology in baseball. I was the first person given the opportunity to help the team by simply looking at the data they had captured, mainly from Rapsodo at first. In the beginning, there was mutual confusion about what I should be doing, because it was a first for everyone and a first run-through is never going to be the best. 
In the first two weeks of the internship, I was only asked to attend two scrimmages, both early on, where I simply kept stats using Rapsodo. After that they didn’t ask me to do anything, probably because they found someone else to do what I was doing, since it was pretty simple. At this point, I started researching things that I should be doing, because doing nothing was not going to be on that list. Since the only data we had was pitching data, I looked into pitching metrics that would tell us something about the pitchers. I came up with three projects that could be useful for analyzing Butler’s pitchers, developing them, and helping them get the most out of their own tendencies. The first project was all about fastballs. I pulled the data off of Rapsodo, eliminated outliers, and created boxplots showing the range of speeds for each pitcher’s fastballs. I then calculated each pitcher’s Bauer Units, which is simply \[\text{Bauer Units} = \frac{\text{Total Spin (RPM)}}{\text{Speed (MPH)}}\] A higher number generally represents a harder fastball to hit because there is more movement, and when a fastball moves it goes up, albeit very slightly. Therefore pitchers with higher Bauer Units will induce more pop flies when throwing higher in the zone, and pitchers with lower Bauer Units will induce more ground balls when throwing lower in the zone. The cool thing about this project is that it helped show why some pitchers are naturally ground ball pitchers: my numbers aligned with what the coaches already knew, which was music to my ears.
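As a rough illustration of the Bauer Units calculation, here is a minimal Python sketch. The pitcher names and readings are made up for the example (they are not Butler data), and the function names are my own, not anything from Rapsodo.

```python
from statistics import mean


def bauer_units(spin_rpm: float, speed_mph: float) -> float:
    """Bauer Units for one pitch: total spin (RPM) divided by speed (MPH)."""
    if speed_mph <= 0:
        raise ValueError("speed_mph must be positive")
    return spin_rpm / speed_mph


def average_bauer_units(pitches) -> float:
    """Average Bauer Units over a list of (spin_rpm, speed_mph) readings."""
    return mean(bauer_units(spin, speed) for spin, speed in pitches)


# Hypothetical fastball readings: (total spin in RPM, speed in MPH).
fastballs = {
    "Pitcher A": [(2200, 88.0), (2250, 90.0)],
    "Pitcher B": [(1980, 90.0), (2070, 90.0)],
}

averages = {name: average_bauer_units(p) for name, p in fastballs.items()}

for name, value in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.1f} Bauer Units")
```

Sorting from highest to lowest puts the pitchers whose fastballs are expected to play better up in the zone at the top of the list.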

The second project was all about change-ups, and maybe a little bit more about fastballs, because to know how good a pitcher’s change-up is you need information about that pitcher’s fastball. There are three important factors in determining how good someone’s change-up is. The first is the difference in speed between the fastball and the change-up. Usually it is calculated with a simple subtraction, but to normalize the data across all the pitchers I chose to divide that difference by the pitcher’s average fastball speed. A simpler way to get the same answer is \[1 - \frac{\text{Change-up (MPH)}}{\text{Fastball (MPH)}}\] This should give you a percentage between 7% and 12% if you are working with MLB pitchers; for Butler the range was 6% to 11%, with a higher number meaning a more difficult change-up to hit. The second important factor is how closely the change-up resembles a fastball. For this I looked at each pitcher’s average release angles for fastballs and change-ups. I used a little bit of linear algebra to calculate the difference in arm slots as a two-dimensional vector holding the left-right distance and the up-down distance. This vector showed how far the average change-up release angle was from the fastball’s. A (0,0) vector would be ideal, meaning that both pitches look the same coming out of the hand. I used more algebra to calculate the last important factor, which is movement. I applied the distance formula to the average horizontal and vertical movement to find the average total movement, which ranged from 12 to 20 inches. The goal of this project was to show ways that pitchers could improve their change-ups and also to help the coaches find where the problems were. Similar math and analysis could be used with any pitch that’s not a fastball.
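The three change-up factors above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names are invented, the release values are treated as simple (left-right, up-down) pairs in whatever units the tracking system reports, and the numbers are hypothetical.

```python
import math


def speed_differential(fastball_mph: float, changeup_mph: float) -> float:
    """Normalized speed gap: 1 - change-up speed / fastball speed."""
    return 1 - changeup_mph / fastball_mph


def release_offset(fb_release, ch_release):
    """Vector from the average fastball release to the average change-up
    release, as (left-right, up-down). (0, 0) means they look identical."""
    return (ch_release[0] - fb_release[0], ch_release[1] - fb_release[1])


def total_movement(horizontal_in: float, vertical_in: float) -> float:
    """Distance formula applied to average horizontal and vertical movement."""
    return math.hypot(horizontal_in, vertical_in)


# Hypothetical averages for one pitcher.
diff = speed_differential(fastball_mph=90.0, changeup_mph=81.0)
offset = release_offset(fb_release=(1.5, 5.5), ch_release=(1.25, 5.75))
movement = total_movement(horizontal_in=12.0, vertical_in=9.0)

print(f"Speed differential: {diff:.1%}")
print(f"Release offset (left-right, up-down): {offset}")
print(f"Total movement: {movement:.1f} inches")
```

Keeping the release offset as a vector instead of a single distance preserves the direction of the miss, which is the part a pitcher can actually work on.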

The last project I completed in the off-season had the objective of measuring command and helping determine which pitchers would be best against certain batters. The idea behind it was that if we needed 1 out and the batter wasn’t good at hitting inside pitches, we could work out which pitcher should be put in. It should be the pitcher who throws inside with the most ease. There are obviously more factors to look at, like which pitcher the batter saw last and how he did against them, but this should be part of the decision, especially if you don’t have one guy who can almost guarantee you an out against any batter. This would have been better if I could have had each pitcher throw a set number of pitches to each of the 9 parts of the zone, but I was just an intern, so I used the data that we already had. I created a spreadsheet with all 17 locations and the corresponding percent of pitches thrown in each location. I also calculated the percent of pitches thrown in the ozone layer, which in simple terms is the outer part of the strike zone and is where every pitcher should be aiming. I made the chart interactive, so that if you wanted to find the pitcher who hits the bottom left corner of the strike zone the most, you could set the importance of every other location to 0 and set the location you want to 1, and the highest number in the total column would be the pitcher to put in. After this project I tried to find more things to do that could expand the arsenal of possibilities. I would have liked to look at the hitting data, but the Rapsodo absolutely hated working for the batters. I looked into using a regression model to predict how high school players will perform in college. This one was an absolute bust, mainly because finding high school statistics for Division I NCAA players turned out to be a lot more difficult than it should be. 
Another student and I searched for two to three hours and found only around 10 players with high school stats, and those stats could easily have been inflated by their high school coaches. I do see this being useful at higher levels of baseball where stats are more accessible, such as predicting how a college player will perform in the majors or how a AA player will perform in AAA.
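Going back to the interactive location chart from the command project, the weighting idea can be sketched in Python as follows. This is a simplified stand-in for the spreadsheet, not the spreadsheet itself: the location names, pitcher names, and percentages are hypothetical, and only three locations are shown instead of all 17.

```python
def weighted_score(location_pcts: dict, weights: dict) -> float:
    """Sum of (importance weight * share of pitches) over all locations.
    Locations missing from `weights` count as importance 0."""
    return sum(weights.get(loc, 0.0) * pct for loc, pct in location_pcts.items())


def best_pitcher(pitchers: dict, weights: dict) -> str:
    """Name of the pitcher with the highest weighted total."""
    return max(pitchers, key=lambda name: weighted_score(pitchers[name], weights))


# Hypothetical share of each pitcher's pitches by location.
pitchers = {
    "Pitcher A": {"low_inside": 0.12, "low_away": 0.20, "middle": 0.10},
    "Pitcher B": {"low_inside": 0.18, "low_away": 0.09, "middle": 0.14},
}

# Importance 1 for the one location we care about, 0 everywhere else.
weights = {"low_inside": 1.0}

print(best_pitcher(pitchers, weights))
```

Because the weights do not have to be just 0 or 1, the same function also covers cases like "mostly low and inside, but low and away is fine too" by giving the second location a smaller weight.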

Butler’s 2020 season started on Valentine’s Day, and a couple of weeks before that I was given my first actual assignment, which turned out okay, to say the least. I was tasked with creating a scouting report, but on our own team. In future seasons, I think this should be done some time in the off-season, so that it can be refined until it looks the way the coaches want and so that the team can see what other teams see and turn their weaknesses into strengths. In fairness, we got the Synergy database somewhat late, and because that database was essential for the reports, there wasn’t much we could have done sooner. I was given a list of around 30 things that the coaches wanted to see, and I used some of my own discretion to help build the reports. Most of it was simply querying the database to find the stats they requested, but my favorite part was going through each batter and finding things like “doesn’t swing at 80% of first pitch strikes” or “misses 70% of curveballs” or “85% of strikeouts were breaking balls thrown down or away, out of the zone.” I would first look at the whole picture and then start dissecting the data by count, pitch, location, and result. After that, I simply looked for big numbers, or numbers that I felt could be useful in creating a strategy for attacking the opposing team’s batters. I believe this was the most important and influential part of the scouting reports, because knowing what your opponent is likely to do allows you to be one step ahead of them, and being a step ahead in any competitive game is going to give you an advantage. I would also like to add that I had another student working with me, Justin Walthers, who was a complete life-saver. I would not have been able to do nearly as much without him. He ended up doing the hitting side of the reports, which focused on the pitchers we would see. 
That covered things like how they perform against different batters, what they are likely to throw in each count, and where they typically throw their pitches. Without him doing this, I would not have been able to put as much effort into the pitching side of the report. There was one last part of the internship that I am thankful the coaches included us in. Every Tuesday we would meet to talk about the upcoming week’s games. At first this was just to present the scouting reports and see if the coaches had any questions or things we could improve upon. Later, and I’m really grateful for this, it turned into a meeting to discuss where we would position the fielders for each batter. Obviously Justin and I didn’t have a ton of say in the final positioning, but it was great to be in the discussion. I really appreciated it when the pitching coach, Ben Norton, would go out of his way to ask us what we thought. He was kind of thrown into an interim head coach position because of some very unfortunate happenings involving the head coach, and he obviously had way more baseball knowledge and experience than Justin and me. But I think he realized that we were nerds when it comes to looking at numbers and charts, and he knew that more opinions are better, especially when they come from people who think differently. In the end, if he didn’t like what we had to say, it could just be ignored, and that would have been perfectly okay with us. As students, we were just happy to be in the room listening to how our university’s baseball coaches made decisions and prepared for games. That’s a really cool experience that only two people have ever gotten at Butler, and I am very grateful for it. I will link a pdf of one of the scouting reports below so that you can get an even better idea of the work we were doing.


email:

Scouting Report