Updated: Feb 9, 2021
Leveraging bibliometrics, we will rank board game designers, artists, and publishers in an attempt to quantify their individual impacts on the hobby.
People have always strived to determine which thing is the best of the things. But, given our lack of impartiality, we often defer our judgement to algorithms which, after ingesting data, help us decide which things are best. Be it the Billboard charts, the Tomatometer, Nielsen ratings, the New York Times Best Seller list, or the preeminent board game list, BoardGameGeek’s Top 100, rankings are the easiest way to compare items within a shared sphere. Generally, these lists rank creations rather than creators. For instance, the Billboard Hot 100 is based on song plays and sales, and the NYT Best Seller List is measured purely on sales. Neither explicitly ranks musicians or authors, but both can be leveraged to those ends. We will similarly use board game ratings to rank board game contributors (designers, artists, and publishers), and we will leverage bibliometrics to do so.
Bibliometrics is the application of statistical measures to analyze publications (generally scientific), with the ultimate aim of quantifying a person’s impact within a field of study. Generally, the measures are a relation between publications (of a specific researcher) and citations (made by fellow researchers). Over a series of articles, we will adapt bibliometric methodologies to compare board game designers, artists, publishers, and board game collections and play histories.
To internalize the methods used for bibliometrics, we must first examine the metrics themselves and understand their foundation: the citation graph.
The citation graph contains all of a researcher’s published works. The horizontal axis shows individual publications, each represented as a vertical bar. The height of each bar represents the number of times that particular publication has been cited. The graph is then arranged with the most cited works on the left, and lesser cited works to the right.
The primary bibliometric index is the h-index, named after Jorge E. Hirsch, who first suggested the metric to determine the relative quality of theoretical physicists. The h-index is not the result of an equation, but is rather marked by a point along the citation graph. The h-index is defined as the maximum value of h such that the author has published h papers which have each been cited h times or more.
For example, if we have a researcher with 5 publications (A, B, C, D, and E) with 9, 8, 6, 2, and 2 citations, respectively, their h-index is 3 (left chart). This is because they have three different publications (namely, A, B, and C) with at least 3 citations each. In contrast, if publication D gained two additional citations (4 total), then the researcher’s h-index would increase to 4, because the fourth paper now has four citations (right chart).
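The definition can be checked with a few lines of code. The following is a minimal sketch, not a reference implementation; the function name h_index is our own, and the citation counts are the ones from the example above.

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    # Sort from most to least cited, mirroring the citation graph.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank   # the publication at this rank still clears the bar
        else:
            break      # every later publication has even fewer citations
    return h


print(h_index([9, 8, 6, 2, 2]))  # 3
print(h_index([9, 8, 6, 4, 2]))  # 4 (publication D gained two citations)
```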
Represented graphically, the h-index is the side length of the largest square which can be drawn within the citation graph, a region called the h-core (shown above in a simplified straight-line model). The areas above and to the right of the h-core (referred to as excess and trailing citations, respectively) are not accounted for in the h-index, which has been pointed to as the major shortcoming of the h-index.
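To make the three regions concrete, the sketch below splits a researcher's citations into the h-core square, the excess citations above it, and the trailing citations to its right. The function name citation_regions and the dictionary keys are our own labels, chosen for illustration.

```python
def citation_regions(citations):
    """Split total citations into h-core, excess, and trailing parts."""
    ranked = sorted(citations, reverse=True)
    # With the list sorted, the publications that clear their own rank
    # form a prefix, so counting them gives the h-index.
    h = sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)
    return {
        "h": h,
        "core": h * h,                              # the h-by-h square
        "excess": sum(c - h for c in ranked[:h]),   # above the square
        "trailing": sum(ranked[h:]),                # right of the square
    }


print(citation_regions([9, 8, 6, 2, 2]))
# {'h': 3, 'core': 9, 'excess': 14, 'trailing': 4}
```

Only the 9 core citations out of 27 are reflected in the h-index of 3; the other 18 are ignored.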
For example, each of the hypothetical charts above has the same h-index, despite having varying amounts of publications and citations. Marek Gagolewski, of Deakin University School of Information Technology, went so far as to formally identify the h-index bias toward certain publication styles. Gagolewski categorized researchers into three distinct types: the perfectionist, the moderate, and the producer (see below). The perfectionist has fewer publications, but each is frequently cited. The producer is the opposite, with many publications which are each cited less often. The moderate is a mix of the two types. Imagining a scenario where each model has an equal number of total citations, the moderate model yields a larger h-index.
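A small numeric illustration of this bias follows. The three citation profiles are invented for the example (they are not taken from Gagolewski's work); each totals 36 citations, and only the shape of the distribution differs.

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)


# Hypothetical profiles, each with 36 citations in total.
profiles = {
    "perfectionist": [12] * 3,   # few publications, each heavily cited
    "moderate": [6] * 6,         # a balance of the two extremes
    "producer": [2] * 18,        # many publications, each lightly cited
}

for name, citations in profiles.items():
    print(name, sum(citations), h_index(citations))
# perfectionist 36 3
# moderate 36 6
# producer 36 2
```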
This bias has spurred the creation of several complementary metrics which together give a more complete representation of different publication styles. Generally speaking, the perfectionist model is the most respected of Gagolewski’s propositions. The producer represents a more shotgun approach, whereas the focused perfectionist model is more desirable and indicative of quality.
Perhaps the most popular h-index alternative is the g-index. The g-index is defined as the maximum value of g such that an author has published g papers which have cumulatively been cited g² times or more. Put another way, the g-index is the largest number of top publications whose average citation count is at least g. For example, to have a g-index of 5, an author would need to have their 5 best works cited at least 25 times in total (i.e., the top 5 articles would have an average of at least 5 citations each).
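The g-index can be sketched the same way as the h-index, using a running total of citations. This is an illustrative sketch under one assumption that should be stated loudly: g is capped at the number of publications, whereas some variants in the literature pad the list with zero-citation papers and allow g to exceed it.

```python
from itertools import accumulate


def g_index(citations):
    """Largest g such that the top g publications total at least g*g citations.

    Assumption: g cannot exceed the number of publications.
    """
    ranked = sorted(citations, reverse=True)
    g = 0
    for rank, running_total in enumerate(accumulate(ranked), start=1):
        if running_total >= rank * rank:
            g = rank
    return g


print(g_index([4, 3, 2, 1]))     # 3 (top three total 9 >= 3*3; all four total 10 < 16)
print(g_index([9, 8, 6, 2, 2]))  # 5 (all five total 27 >= 25)
```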
By its definition, the g-index will always be greater than or equal to the h-index. Since the g-index accounts for all of the excess citations, but only a portion of the trailing citations, the g-index favors the perfectionist model.
If an author has a small number of publications, it is common for their h- and/or g-indices to become saturated, meaning the author’s indices are limited by publications rather than citations. Over time it is likely that citations will eventually become the limiting factor, but this early publication threshold creates an h- and g-index bias toward senior authors.
The senior bias of the h- and g-indices is compounded by the fact that citations can only increase over time, so neither index can ever decrease. To account for this bias, the m-index (also known as the m-quotient) was introduced. The m-index is the h-index divided by an author’s research tenure (measured in years). This metric assumes that citations accumulate evenly year to year, but generally annual citations diminish as research ages, resulting in an over-correction favoring young authors.
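Saturation is easy to test for: an index is saturated when it equals the number of publications. The sketch below assumes the g-index is capped at the publication count, and the helper name saturation is our own.

```python
from itertools import accumulate


def h_index(citations):
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)


def g_index(citations):
    # Assumption: g is capped at the number of publications.
    ranked = sorted(citations, reverse=True)
    return max(
        (rank for rank, total in enumerate(accumulate(ranked), start=1)
         if total >= rank * rank),
        default=0,
    )


def saturation(citations):
    """Report whether each index is limited by the publication count."""
    n = len(citations)
    return {
        "h_saturated": n > 0 and h_index(citations) == n,
        "g_saturated": n > 0 and g_index(citations) == n,
    }


print(saturation([12, 10]))         # {'h_saturated': True, 'g_saturated': True}
print(saturation([9, 8, 6, 2, 2]))  # {'h_saturated': False, 'g_saturated': True}
```

In the first case, two well-cited papers cannot produce an index above 2, no matter how many citations they collect.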
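As a rough sketch, the m-index is a single division once the h-index is known. The years below are hypothetical, and treating a tenure of less than one year as one year is our own assumption to avoid dividing by zero.

```python
def h_index(citations):
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)


def m_index(citations, first_publication_year, current_year):
    """h-index divided by research tenure in years."""
    # Assumption: a tenure shorter than one year is counted as one year.
    tenure = max(current_year - first_publication_year, 1)
    return h_index(citations) / tenure


# Hypothetical author: h-index of 3, first published in 2015, measured in 2021.
print(m_index([9, 8, 6, 2, 2], 2015, 2021))  # 0.5
```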
Google Scholar is an online tool for searching scholarly literature. In addition to enabling a simple scholarly search, the tool also calculates a variety of bibliometrics, one of which Google introduced: the i10 index. The i10 index simply counts the number of publications with ten or more citations, making it one of the simplest bibliometrics. The i10 index could be adapted to any fixed citation count, and can help identify a researcher’s h-index ceiling.
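A generalized version of the i10 index might look like the sketch below, where the threshold is a parameter. The name i_index and the sample citation counts are ours. The link to the h-index ceiling follows from the definitions: an h-index of n or more requires at least n publications with n or more citations.

```python
def i_index(citations, threshold=10):
    """Count publications with at least `threshold` citations (i10 by default)."""
    return sum(1 for cites in citations if cites >= threshold)


citations = [25, 14, 10, 9, 3]  # hypothetical citation counts
print(i_index(citations))       # 3

# The h-index is the largest n for which at least n publications
# have n or more citations.
h = max(n for n in range(len(citations) + 1) if i_index(citations, n) >= n)
print(h)                        # 4
```

Here an i10 of 3 shows that the h-index cannot reach 10 until seven more publications cross the ten-citation mark.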
The AR-index is an age-dependent metric that considers core and excess citations. The metric is the square root of the sum of age-weighted citations (each publication’s citations divided by its age in years) over the publications in the h-core. Unlike most other metrics explored thus far, the AR-index can decline over time, and provides a view of recent citation intensity. The AR-index helps identify which authors have a large and recent impact. It is the bibliometric best suited to determine which authors are trending, similar to the BGG concept of the hotness.
Another time-dependent metric is impact factor, which attempts to measure the influence of an author within a specified time range. An author’s impact factor is the average number of citations for publications made within a specified span of time. The impact factor can be measured over any time period, but is generally measured over the previous two or five years of publications. Since the impact factor only considers publications within a limited window, it is much more volatile than other metrics, and can fluctuate wildly based on the considered timeframe.
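The calculation can be sketched as follows: find the h-core, divide each core publication's citations by its age, sum the results, and take the square root. The (citations, age) pairs are hypothetical, and the sketch assumes every age is at least one year.

```python
import math


def ar_index(publications):
    """publications: list of (citations, age_in_years) pairs, with age >= 1."""
    # Rank by citations to locate the h-core.
    ranked = sorted(publications, key=lambda pub: pub[0], reverse=True)
    h = sum(1 for rank, (cites, _) in enumerate(ranked, start=1) if cites >= rank)
    # Age-weight only the publications inside the h-core.
    return math.sqrt(sum(cites / age for cites, age in ranked[:h]))


pubs = [(9, 3), (8, 2), (6, 6), (2, 1), (2, 4)]  # hypothetical data
print(round(ar_index(pubs), 3))  # 2.828, i.e. sqrt(9/3 + 8/2 + 6/6)
```

If no new citations arrive, every age grows while the citation counts stay fixed, so the index shrinks year over year.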
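The sketch below follows the simplified description given here (average citations of the publications released inside a window), not the formal journal impact factor calculation. The publication years and citation counts are hypothetical.

```python
def impact_factor(publications, start_year, end_year):
    """Average citations of publications released in [start_year, end_year].

    publications: list of (publication_year, citations) pairs.
    """
    in_window = [cites for year, cites in publications
                 if start_year <= year <= end_year]
    if not in_window:
        return 0.0  # assumption: no publications in the window scores zero
    return sum(in_window) / len(in_window)


pubs = [(2016, 9), (2018, 8), (2019, 6), (2020, 2), (2020, 2)]  # hypothetical
print(round(impact_factor(pubs, 2019, 2020), 2))  # 3.33 (two-year window)
print(impact_factor(pubs, 2016, 2020))            # 5.4 (five-year window)
```

The same author scores 3.33 or 5.4 depending only on the window, which illustrates the volatility described above.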
Clearly there is no bibliometric silver bullet. Any single metric taken in isolation provides limited insight, because each has its own bias. A balanced analysis requires consideration of multiple metrics in unison.
Bibliometrics can be adapted to areas outside of academia by identifying suitable proxies for publications and citations. For instance, YouTube creators often use the h-index to measure their influence, using views and video count in lieu of citations and publications. However, YouTube videos are viewed thousands of times, so the view counts are reduced by a factor of one hundred thousand before use in calculations.
Applying bibliometrics to board gaming requires similar changes. First, a proxy for citations must be identified. BoardGameGeek (BGG) is the best data source option, since it is the single greatest hub for public board game data. BGG measures ratings, owners, total plays, page views, and a myriad of other factors for each game in the database, but each of these metrics has its own shortcomings. For example, every BGG visitor contributes to page views, but only registered users can submit ratings and flag ownership, and a smaller subset of users log play records.
Unfortunately, BGG restricts which measures are publicly available through its API, further limiting the proxy options. Having analyzed the BGG database, we identified ‘rating count’ as the most indicative measure of a game’s ranking, making it a good proxy for impact and influence as well. A game’s rating count represents the number of registered BGG users who have rated a specific game (regardless of the rating value). Rating counts are generally on the order of 10,000, so they will be reduced by a factor of 1,000 before use in calculation.
In addition to the biases of the various indexes, there are biases in the data as well. First off, BGG is only 20 years old, so the data for games released prior to that period, though available, is not as robust as the data for more recent titles. Furthermore, the hobby gaming industry has been growing increasingly large, and as a result recent games garner increased attention, and thereby more ratings, further skewing the indexes toward recent releases.
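In code, the adaptation amounts to dividing each raw count by a scale factor before applying the usual definition. The view counts below are invented for illustration; only the factor of one hundred thousand comes from the text.

```python
VIEW_SCALE = 100_000  # one "citation" per hundred thousand views


def h_index(values):
    ranked = sorted(values, reverse=True)
    return sum(1 for rank, value in enumerate(ranked, start=1) if value >= rank)


def scaled_h_index(raw_counts, scale):
    """h-index computed on raw counts divided by a scale factor."""
    return h_index([count / scale for count in raw_counts])


views = [1_200_000, 450_000, 310_000, 90_000]  # hypothetical channel
print(scaled_h_index(views, VIEW_SCALE))  # 3
```

Without the scaling step, the same channel would have an h-index of 4, simply because every video has more than 4 views.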
In the BGG database, multiple designers, artists, and publishers can be attributed to any given game, but we will ignore this and grant full credit to each contributor, rather than splitting citations evenly across each contributor. Additionally, BoardGameGeek tracks which games are reimplementations of preexisting games (or part of the same family), but we will ignore this property due to the inconsistent use of the classifications. Thus, even if two games are nearly identical, they will be considered unique for the purposes of this analysis. However, the system does distinctly categorize expansions, which we will not include in our analysis; only base games are considered. Each of these decisions impacts the analysis, but each was made to best prevent our own particular biases, and those of BGG admins, from influencing the results.
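Taken together, these rules can be sketched as a small pipeline: drop expansions, scale each rating count by 1,000, credit every listed contributor in full, and compute an index per contributor. The records and field names below (rating_count, designers, is_expansion) are hypothetical stand-ins, not the BGG API schema.

```python
from collections import defaultdict

RATING_SCALE = 1_000  # rating counts are reduced by a factor of 1,000


def h_index(values):
    ranked = sorted(values, reverse=True)
    return sum(1 for rank, value in enumerate(ranked, start=1) if value >= rank)


def contributor_h_indices(games, role="designers"):
    """Map each contributor in `role` to an h-index over their base games."""
    credited = defaultdict(list)
    for game in games:
        if game["is_expansion"]:
            continue  # base games only
        scaled = game["rating_count"] / RATING_SCALE
        for name in game[role]:
            credited[name].append(scaled)  # full credit, no splitting
    return {name: h_index(values) for name, values in credited.items()}


# Hypothetical records for illustration only.
games = [
    {"name": "Game A", "rating_count": 42_000, "designers": ["Designer X", "Designer Y"], "is_expansion": False},
    {"name": "Game B", "rating_count": 8_000, "designers": ["Designer X"], "is_expansion": False},
    {"name": "Game C", "rating_count": 3_000, "designers": ["Designer X"], "is_expansion": False},
    {"name": "Game A: Expansion", "rating_count": 15_000, "designers": ["Designer X"], "is_expansion": True},
    {"name": "Game D", "rating_count": 1_500, "designers": ["Designer Y"], "is_expansion": False},
]

print(contributor_h_indices(games))
# {'Designer X': 3, 'Designer Y': 1}
```

Passing a different role, such as a hypothetical "artists" or "publishers" field, would rank those contributors with the same logic.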
Bornmann, Lutz, et al. “A Multilevel Meta-Analysis of Studies Reporting Correlations between the h Index and 37 Different h Index Variants.” Journal of Informetrics, vol. 5, no. 3, 2011, pp. 346–359, doi:10.1016/j.joi.2011.01.006.
Egghe, Leo. “Theory and Practice of the g-Index.” Scientometrics, vol. 69, no. 1, 2006, pp. 131–152, doi:10.1007/s11192-006-0144-7.
Hirsch, J. E. “An Index to Quantify an Individual's Scientific Research Output.” Proceedings of the National Academy of Sciences, vol. 102, no. 46, 2005, pp. 16569–16572, doi:10.1073/pnas.0507655102.
Hovden, Robert. “Bibliometrics for Internet Media: Applying the h-Index to YouTube.” Journal of the American Society for Information Science and Technology, vol. 64, no. 11, 2013, pp. 2326–2331, doi:10.1002/asi.22936.
“ISI 5-Year Impact Factor.” American Psychological Association, American Psychological Association, www.apa.org/pubs/journals/5-year-impact-factor.
Jin, Bihui, et al. “The R- and AR-Indices: Complementing the h-Index.” Chinese Science Bulletin, vol. 52, no. 6, 2007, pp. 855–863, doi:10.1007/s11434-007-0145-9.