Google Ngram for martial arts historical trends?

Question: How responsive is Google Books’ Ngram analysis to real historical trends in budo?

Google Ngram tracks trends in the frequency of word usage in published books.  More details are available from Google Books.  Theoretically, the Ngram analysis should measure the popularity of certain words in daily discourse and, thus, reflect historical changes and events as expressed through our use of language.  For example, the Soviet Union was formed in 1922 and dissolved in the early 1990s, so the Ngram analysis should show the use of the term “USSR” starting sometime in the mid-1920s.  In fact, this is what it shows.  Here is an Ngram analysis of the terms “USSR” and “Russia” during the 20th century:


What you can see here is that the use of the terms accurately reflects historical trends in Eurasia and its relationship with the Anglosphere.  “USSR” does not appear until around the time of the creation of the Soviet Union.  Between the creation of the Soviet Union and 1940, there is no association between the terms as writers are probably still used to referring to Russia by its old name.  During WWII, when Stalin was an ally of the English-speaking world, use of the term “Russia” rose considerably.  Then in 1945 there is an abrupt change as Stalin changes from ally to foe.  After 1945, there is a clear inverse relationship between “Russia” and “USSR” and, in the 1990s, “Russia” begins to rise again in popularity following the Soviet Union’s demise.
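
If you want to put a number on an "inverse relationship" like this one, a rough option is to compute a Pearson correlation coefficient over the yearly frequencies of the two terms. Here is a minimal sketch in Python. The two series are made-up numbers for illustration only, not actual Ngram output, and `pearson` is just a helper name chosen for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical yearly frequencies (arbitrary units), NOT real Ngram data:
# one term falls steadily while the other rises steadily.
term_a = [5.0, 4.0, 3.0, 2.0, 1.0]
term_b = [1.0, 2.0, 3.0, 4.0, 5.0]

r = pearson(term_a, term_b)
print(f"correlation: {r:.2f}")
```

A value near -1 means one term falls as the other rises, a value near +1 means they move together, and a value near 0 corresponds to the "no association" case.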

Here is another Ngram analysis showing pejorative terms for Japanese people.


The terms “Jap” (short for Japanese) and “Nip” (short for Nippon) were used during WWII as pejorative references to the Japanese.  Note how the Ngram shows use of the terms rising sharply starting in 1940, peaking in 1945, and then dropping again after the Japanese were no longer enemies of the Anglosphere.  Note also that the Ngram is sensitive enough to distinguish between “Nip” as a pejorative and the normal English word “nip,” which shows no WWII-era bump in usage.
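
Ngram Viewer searches are case-sensitive by default, which is why a capitalised word and its lowercase twin show up as separate lines. The idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the tiny corpus and the stand-in word pair "Turkey"/"turkey" are invented here, and the function name `relative_frequency` is hypothetical, not anything Google provides.

```python
from collections import Counter


def relative_frequency(corpus_by_year, term):
    """Share of all tokens in each year that exactly match `term` (case-sensitive)."""
    result = {}
    for year, texts in corpus_by_year.items():
        # Split on whitespace and strip surrounding punctuation from each token.
        tokens = [tok.strip(".,;:!?\"'()") for text in texts for tok in text.split()]
        tokens = [tok for tok in tokens if tok]
        counts = Counter(tokens)
        result[year] = counts[term] / len(tokens) if tokens else 0.0
    return result


# Invented toy corpus, keyed by publication year (not real data).
toy_corpus = {
    1935: ["We roasted a turkey.", "The turkey was large."],
    1943: ["Turkey stayed neutral.", "We roasted a turkey."],
}

upper = relative_frequency(toy_corpus, "Turkey")
lower = relative_frequency(toy_corpus, "turkey")
print(upper)
print(lower)
```

Because the match is exact, the capitalised form only registers in the year where it actually appears, while the lowercase form is counted separately in both years.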

Terms like “USSR” and “Jap” are easy to track with Ngram analysis because they are related to major historical events that would necessarily impact many genres of writing from fiction to journalism to technical manuals.  However, for minor historical events or insignificant phenomena, Ngram might not work so well.  Imagine, for example, that there are 20,000,000 people in the English-speaking world who share some hobby*.  It’s likely they will know each other, have clubs, magazines, etc., and thus develop a common terminology that can be measured by Ngram.  Now imagine there are only 20 people who share some hobby, 5 living in Australia, 5 in England, 5 in California, and 5 in New York.  Although they do the same thing, they might not know each other and thus might have developed completely different words to describe what they do.  In this case, Ngram would have no way to track the existence of this hobby through the English words in publications.

Further complicating the matter, Ngram does not search every English publication in existence, only the ones Google Books has access to.

So, the question is, can Ngram possibly tell us anything about the practice of budo in the English-speaking world?

To try to validate Ngram, I searched on the term “judo.”  I chose judo for several reasons:

  • unlike kusarigama-jutsu 鎖鎌術, judo is actually practiced by more than 20 people in the English-speaking world
  • judo is the oldest popular budo in the West, so it should have references going back far in the Ngram database
  • unlike jujutsu, which is also transliterated as jiujutsu, jyujutsu, jujitsu, etc., judo is easily and consistently written in Latin script

My first step was to go to a website and get information on the history of judo.  Here is some of what it had to say:

Ngram can search books back to 1800.  Judo was created by Kano Jigoro in the 1880s, so Ngram should be able to show us the beginnings of judo as well as the developments reflected in the above history.  Let’s see how it does.  Here is an Ngram search on the term “judo” during the 20th century (1900-2000).


As you can see, there was very little interest in judo in English language publications until about 1910, when it suddenly exploded.  This was the time when Kano became a member of the International Olympic Committee.  Interest in judo then declined from the mid-1910s through the late 1910s.  This is the period of WWI, when there was probably little concern in English publishing with a strange hobby from the East.  It remained low through the Great Depression, which makes sense, and then began to rise rapidly in 1940, when–as the judo history above tells us–it started being taught for self-defence during WWII.  Following WWII, it was up and down until 1960, when it started rising consistently, both in the run-up to and following judo’s first appearance in the 1964 Olympics, which the history also mentions.

So, Ngram seems to do a good job of reflecting real events in the history of judo.  I would say, for a hobby historian, Ngram is validated as a way of researching budo.

Now, for fun, let’s look a little closer.  Here’s Ngram on the interest in judo in Germany between 1920 and 1960:


As you can see, under the Weimar Republic, Germans had little interest in judo, but when they began to militarise under the Nazis, it really took off*, only to plummet again as they began to lose the war.  Unlike in Germany, which started preparing for war early, interest in judo in the English-speaking world didn’t take off until the war began.  By using Ngram to look at differences in British English and American English, we can see that the British started writing about judo all of a sudden in 1939, when they could see the writing on the wall, whereas American interest in judo rose more gradually throughout the war.  In addition, British interest in judo plummeted after the war, whereas Americans, high on being winners, interested in their new little brother in democracy, Japan, and ramping up for the Cold War, continued to have increasing interest in the art.


Finally, let’s see how Kodokan judo has fared against Brazilian jujutsu since the 1990s.  I would expect that the interest in BJJ would have really detracted from the more traditional art, but…


Instead, it appears that BJJ has resulted in–or at least coincided with–a little revival of interest in Kodokan judo.  Go figure!

So, there you have it.  If you enjoy investigating the history of budo, I encourage you to use Google Ngram, which seems like a valid way of examining trends in martial arts.

If you find any cool budo information using Ngram, please drop me a line in the comments or e-mail me.

Also, one thing that’s a huge pain is that the iframe that Google generates for embedding the Ngram chart doesn’t work in WordPress. If you know how to do this easily in WordPress, please let me know.


