A DAY AT THE RACES WITH TEAMSQL
In this article, we took a public database with real and historical data about formula one races located at http://ergast.com/mrd/db/, we imported into a PostgreSQL database engine, and then we start querying using TeamSQL trying to cover and use a broad spectrum type of business queries and at the same time using different SQL clauses. We expect you can find some of the exciting queries presented here usable and applicable perhaps to your business data.
QUERY ONE – A VIEW TO DETERMINE DRIVER CHAMPION
Our first query is not very exciting; it is oriented to create a good foundation for next queries. The database we have is like a photo of what happened in every race with every pilot and every constructor. However, the database is not storing in a direct way which driver or constructor wins the championship every year. Naturally, as we have all the information in the database, we can calculate who was the champion. Then, that is what we will do by using views. We will create two views, one for drivers and one for constructors. Let’s see the following print screen with the drivers view creation and how to consume the view to obtain the champion in 2008
QUERY TWO – WORLD CHAMPION TITLES BY NUMBER
In this query, we obtain the number of times a driver won the champion title ordered by the number of times they won. The interesting point in this query is the list of years when the driver won. We obtain this list by using the function string_agg. We suggest you take this function present with you group records and need a concatenation of strings, one per record.
QUERY THREE – POLE POSITION BY NUMBER AND BUSINESS METRICS
Some of the historical data is not complete. In the case of pole positions, we have the qualifying results only since 1994. Anyway, this is an interesting query, to see how much times a pilot won the pole position. Moreover, as in every business we need metrics to measure how we are performing. Then I add some extra parameters in this query, to calculate how many races a driver won when he starts in pole position. I also calculate the rate, to compare different drivers. And you know who has the best “pole and win”rate? Let’s see the query
QUERY FOUR – WON A RACE, BUT NEVER CHAMPION
Winning a championship is hard, only a few drivers have this privilege. However many other drivers work hard in every race, perhaps some of them even won one race. This query is a tribute to those drivers who at least win one race, but he never won a championship.
QUERY FIVE – EVOLUTION OF AVERAGE LAP TIME BY WINNER ON FAMOUS CIRCUITS
This query is a way to use big data to discover facts over a considerable period. In this case, we created a query to obtain the average lap time (in milliseconds) of a classic circuit (for this analysis we choose Monaco and Monza). The query is the following:
If we use the chart feature in teamSQL, we can verify that there is a tendency over the years to reduce lap time. Remember, this analysis is based only on race winner times on one specific circuit.
In the next image we can see Monza Average lap time of winner evolution:
We covered with several queries, different business data questions showing the SQL used in each case. The main idea of this article was giving to the reader a toolbox of queries, so you can adapt and apply them in your business questions. SQL is not complicated, and TeamSQL do it more comfortable!