wdax wrote: As most people know, the evaluation and judging of kata in IJF kata competition is based on the official Kodokan instruction materials, namely the DVDs and the textbooks. This has led to the opinion and conclusion that the highest scores are given to those who "copy" the movements of the DVDs. This is of course nonsense, but here and there we find this idea among competitors as well as among those who criticize this in public.
With this post I will try to explain the philosophy behind the scoring system, and in doing so I want to give some background information that is usually not explained in public. To tell the truth: I do not believe that all international kata judges are familiar with the points I will explain below. The published results, for example of the WC in Kyoto, prove that.
1.) Outline of scoring
There are 5 judges, who give scores for the kata. The highest and lowest scores are dropped, so the scores of three judges remain in the final result.
There are points from 0 to 10 given for:
- opening and closing
- each of the 15 (resp. 20/21) actions
- overall flow
In the case of Nage-, Katame- and Ju-no-Kata there are 18 marks from each judge, which gives a maximum score of 180 points per judge and a maximum of 540 points in total across the three judges. Kime-no-Kata and Kodokan Goshinjutsu include more techniques, so the max. score is higher.
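The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and it assumes that the highest and lowest scores are dropped at the level of each judge's total for the whole kata, which the text does not spell out.

```python
def max_total(marks_per_judge: int = 18, max_mark: int = 10,
              counted_judges: int = 3) -> int:
    """Maximum possible result: 18 marks x 10 points x 3 counted judges."""
    return marks_per_judge * max_mark * counted_judges


def final_result(judge_totals: list[int]) -> int:
    """Drop the highest and lowest of five totals and add up the other three.

    Assumption: the drop is applied to each judge's total for the kata.
    """
    if len(judge_totals) != 5:
        raise ValueError("expected the totals of exactly 5 judges")
    ordered = sorted(judge_totals)
    return sum(ordered[1:-1])  # skip the lowest and the highest total


print(max_total())
print(final_result([150, 140, 160, 120, 170]))
```

With the made-up totals in the last line, the 120 and the 170 are dropped and the remaining three totals are added up.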
2.) Criteria for scoring the techniques
If the judges give marks from 0 to 10, then they need a clear guideline about the difference between, for example, a 7 and an 8 or a 4 and a 5. If there are no such guidelines, scoring becomes a lottery of personal "opinions" and the system becomes completely corrupt. Obviously the Kodokan DVDs do not offer such a guideline, so the DVDs cannot be a guideline for scoring.
There is a very clear philosophy behind the scoring, although of course there are many problems in putting it into practice, also because the same ideas have to be applied to completely different kata. I will address some of these issues later.
What are the basic ideas?
A) No matter how "accidental" or wrong an attempt to execute something is - it must always be scored higher than a forgotten technique.
--> so the rules say that the score for a forgotten technique is 0 and the score for any attempt is a minimum of 1 point. I think this is understandable and cannot be questioned.
B) A technique which is completely wrong must always be scored lower than a technique which is basically correct, but "only" lacks realism and effectiveness.
--> the score for a completely wrong or failed technique (f.ex. harai-goshi instead of uchi-mata, dropping a weapon, being "killed" in KDG or Kime-no-Kata etc.) is something between 1 and 5 and additionally the max. score for overall flow is 5.
In other words: if a technique is basically correct, but lacks realism and/or effectiveness, the minimum score is 5, without limiting the score for the overall flow.
I think this general rule cannot be questioned either. Totally wrong is worse than ineffective....
C) An execution of an action which is basically correct, but lacks realism and/or effectiveness, must always be scored lower than a technique which has some room for improvement, but whose flaws do not really affect the functionality of the technique.
---> so techniques which are basically correct, but flawed in realism, have to be scored between 5 and 7 points. The range from 8 to 10 points is reserved for executions without any real problems in realism and effectiveness.
Typical flaws of realism/effectiveness are wrong distances, not enough kuzushi, "reaction" before an attack, Uke moving voluntarily, too little dynamism in the action etc.
So we have clear guidelines:
0 --> forgotten technique
1-5 --> completely wrong or failed
5-7 --> basically correct, but flawed in realism
8-10 --> correct and effective technique
The Kodokan DVDs and the textbook can only be used as guidelines about what is basically correct - no less, no more.
The scoring sheet of IJF kata-competitions
Maybe it was not the best idea, but when the basic guidelines were ready, the task of making scoring sheets had to be done. The idea was to make crosses instead of writing marks. So the idea of subtracting mistakes of different levels from the ideal of 10 was born.
Big mistake: a big mistake is defined as an action which has completely failed or is basically wrong. A big mistake is a subtraction of 5 points from the max of 10. A big mistake can be combined with additional small and medium mistakes, so the max. score for a technique that includes a big mistake is, according to the above-mentioned general rule, 5 points or less.
Medium mistake: If there is a medium mistake in an action, the max. score is 7 (see above). So medium mistakes lead to a subtraction of 3 points. Attention: only one medium mistake can be scored in a technique, because two medium mistakes would be a subtraction of 6 points and the score would be lower than 5. This is not possible according to the general guideline. So the question is: was there a lack of realism or not. This is a problem in evaluating nage-no-kata (where right and left techniques are scored together) and ju-no-kata with its longer sequences of attack and defence, where we have a lot of "chances" to make medium mistakes....
Small mistake: a small mistake is not really a mistake, but an "imperfection". It's something not really perfect, but without effect on the realism of a technique. A small mistake is a subtraction of one point and each judge can score a maximum of two small mistakes for each technique (a third one would equal a medium mistake and therefore something that lacks realism - this would be against the general guideline, so it is not possible).
Small mistakes without a medium mistake result in 8-10 points; in combination with a medium mistake the score is between 5 and 7.
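The subtraction scheme described above can be written down as a small Python sketch. The function name and its parameters are invented for illustration. One detail is an assumption: when a big, a medium and two small mistakes are all marked, the subtraction gives 0, and the sketch raises that to 1 because rule A says any attempt scores at least 1 point.

```python
def technique_score(big: bool = False, medium: bool = False,
                    small: int = 0, forgotten: bool = False) -> int:
    """Score one technique by subtracting marked mistakes from 10."""
    if forgotten:
        return 0  # rule A: a forgotten technique scores 0
    small = min(small, 2)  # at most two small mistakes are counted
    # big = -5, medium = -3 (only one can be marked), small = -1 each
    score = 10 - (5 if big else 0) - (3 if medium else 0) - small
    # Assumption: an attempt never drops below 1 point (rule A).
    return max(score, 1)


print(technique_score(small=2))               # two imperfections only
print(technique_score(medium=True, small=2))  # flawed realism plus two imperfections
print(technique_score(big=True))              # completely wrong or failed
```

The three calls give 8, 5 and 5, which are the lower edge of the 8-10 band, the lower edge of the 5-7 band and the upper edge of the 1-5 band.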
There are of course some finer points:
Follow-up mistakes: very often we find that one mistake results in a second one and so on. The general rule is that only the biggest of these should be scored. For example, if the starting distance is not really wrong, but a little bit too short (small mistake) and that results in a lack of kuzushi (medium mistake), which in turn results in (very small) problems with the final balance, then only the medium mistake should be marked and the score is 7. In contrast, if the distance is a bit short, but kuzushi is ok and there are minor problems with the final balance, the score is 8 (two small mistakes). If the final balance is also perfect, then it's a 9 (only one small mistake).
Typically in ju-no-kata the follow-up mistakes are a problem, but they only occur when there is already a medium mistake, i.e. a flaw in realism.
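The follow-up rule can be illustrated with a short Python sketch. The data layout is an assumption made for this example: each inner list is one chain of mistakes in which each mistake caused the next, and only the biggest mistake of each chain is counted. The caps on the number of medium and small mistakes are left out to keep it short.

```python
PENALTY = {"small": 1, "medium": 3, "big": 5}


def score_with_chains(chains: list[list[str]]) -> int:
    """Count only the biggest mistake of each causal chain, then subtract from 10."""
    counted = [max(chain, key=PENALTY.__getitem__) for chain in chains if chain]
    # Assumption: an attempt keeps at least 1 point.
    return max(1, 10 - sum(PENALTY[m] for m in counted))


# Short distance -> lack of kuzushi -> slight balance problem: one chain.
print(score_with_chains([["small", "medium", "small"]]))
# Short distance and, independently, a minor balance problem: two chains.
print(score_with_chains([["small"], ["small"]]))
# Short distance only.
print(score_with_chains([["small"]]))
```

These three cases reproduce the 7, 8 and 9 from the example above.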
Positioning: wrong positioning on the tatami is regarded as a medium mistake; that should be changed IMHO. Doing a correct thing from a wrong position should be scored higher than doing a flawed thing from the "correct" position. But these problems can be avoided, because those who enter kata competition should be able to prepare themselves and find out what the correct starting positions for each kata are.
What I tried to explain is the main reasoning behind the scores. BTW: the demonstrations seen on the Kodokan DVDs are not perfect in the sense, that they would be scored with 10 points for each technique. The average score would likely be around 8 points...
If one looks at the scores of international competitions, one finds out, that the cut for the medal-ranks is around 80% which is an average score of 8 points. So there are only very few people in the world at the moment, who can demonstrate a full kata without any "medium" mistakes. Of course there are a lot of other people believing, that they can do it, but a reality check usually opens their eyes.
Of course - effective technique and realism are not all there is to kata, but there cannot be "a better" kata when technical flaws are repeated and people do not work on them.
Wdax,
Thank you for doing this. Like you say, it is useful for both readers and kata judges.
It is a great experience for all of us to hear you write about this with passion. As a successful kata competitor and a person who has submitted himself to judgment on his kata for many years, you have earned our respect. You are not just a talker but also a doer and someone who also teaches kata, allowing others to benefit from your experiences.
As you will see I agree with almost everything you say, when you enter personal notes. So, I have no problem with you, but I do have a problem with the rules and system of rules.
As rational human beings we like to understand things, and thus written rules, especially if concise, may help that understanding. Only, the judging experience has the same characteristics as a chain: it is only as solid as its weakest link.
Let's take the IJF shiai refereeing rules. The description of what is an ippon score is pretty clear, and also fundamental to judo. It's one of the first words new judoka learn. If ippon is so clear and so obvious, then how come it is constantly contested ? It is because situations deviate from textbook examples, it is because referees have limits in understanding, it is because of nerves, it is because of prejudice, it is because of numerous factors.
Few kata judges have studied kata like you have, yet all of them are supposedly certified to judge you. How is that possible ? Sure, there is no requirement that one has to be able to perform things as well as the performer in order to be able to judge them. I can quite in depth judge and explain why Wilhelm Kempff's Beethoven is so special, and I can also say how it differs from that of Wilhelm Backhaus whose Beethoven is equally brilliant, yet different from that of both Kempff and Edwin Fischer. At the same time, it also illustrates that it is possible for three different performances to be near perfection. Yet, there is only one and the same music score for each piece. How is this possible ?
While I know I can do that, there are also occasions where friends of mine are exposed to music and where conversations arise during which it is quickly obvious they have no clue and can't distinguish. Why ? Obviously because it takes a lot of training and study.
Back to kata. There are going to be judges who, despite not having your level, may be able to judge you properly, and there are going to be others who can't.
But the question is "how do we know?" And how can we use that knowledge to make differential decisions in who can and should be a kata judge.
The system of selection is problematic too, since it is not open. I am not an IJF kata judge, yet I have studied and researched kata very extensively. I have people in Japan whom I call my "friends" people whom I trust and I know from extensive interaction that their knowledge outweighs that of most teachers at the Kôdôkan and in the IJF. They are not kata judges either, are not even Kôdôkan instructors or much associated with the Kôdôkan. Under the current system it is not possible to get in the IJF or EJU committees or be judges unless you are first nominated by your own national federation, which practically means that you have to submit to all the politics and all the brownnosing. And some people excel in doing that even though their skills are very limited, whereas others refuse to submit to that. The consequence is that your end product does not consist of the most capable people but of the people who are willing to pay lip service to the greatest extent. This is problematic and also a known problem in economics and business management.
If one fears people who are critical, the consequence is that you are going to end up with a deficient product, because even if in your organization there are people who know about the flaws, they are going to keep their mouths shut because primarily they want to be your friend and not lose their job. In consequence, the same problems still exist, but will only emerge when the product is public, and the criticism will come from the anonymous public who does not have and never received any say.
When I was in the army and at one point working in the kitchen, I was disgusted by the cockroaches and substandard hygiene. At one point we were told that there would be an inspection from some high-up bobo, like a general or something. So our colonel forced everyone to work day and night to make the kitchen so clean that you could eat off the ground. We were threatened that if they were able to make any remark, the colonel would make our life a misery and find a reason to lock us up. When the inspection came, it was judged that our facility was exemplary. As soon as the general was gone, the cockroaches came out again and it was back to the same. Now, why did no one step up to the general and say "general, this is all one big sham, our kitchen is disgusting and every day there are cockroaches everywhere, but the colonel threatened us to set up this piece of theater and no one dares to open his mouth out of fear of reprisals" ? Why did this not happen ? Because people do not want to risk their own perks. You know as well as I do that if anyone would have, I would probably have done precisely that even if I knew that I would be screwed. In the end that did not transpire either, because I had no kitchen service on the day the general came and was not even there when he visited.
The IJF and EJU are the same. Unless you kiss ass, it is not possible to be in there. There is no one there who refuses to kiss ass and goes just for quality. I write this not as criticism of the organization but to show that even if your rules were perfect, the enforcement can't be, because of people's considerable flaws.
I will now stepwise address your points:
1) Scoring
"There are points from 0 to 10". We know from other aesthetic sports that this is not really true and that scoring is skewed. In reality there are points from 6-8, unless something is so obvious, like skipping a technique. Now one could argue that it is only logical that scores are mostly 6-8 because likely people are mostly going to score in the middle with here and there a couple that is exceptionally poor and one that is exceptionally good. However, that is not necessarily true. You do not know beforehand what the statistical distribution is going to be and whether quality will indeed have a Gaussian distribution. It may or may not. The problem is that judges intentionally give marks that are skewed. The reason is that if marks are extreme, chances that they will be identified as being off are substantial, with consequences for the judges' future. Besides, extreme scores are dropped based on the entirely speculative view that the more they deviate from the median the more wrong they are. No judge intentionally gives scores in order to have his scores dropped. The rationale behind the skewing of scores is that the average score will likely be the most correct one. This is, however, wrong. If there are 5 judges and three of them give a 7, while one of them gives a 2 and one of them gives a 9, this rationale dictates that the 7 is likely the most correct score because it is in the middle and because the frequency at which this score was given is three times as high as the frequency at which a score of 2 or of 9 was given. Unfortunately that rationale is simplistic and fundamentally wrong. In fact, it may well be that the score of 9 or the score of 2 was the most correct one, and that correctness would have been achieved by dismissing all other scores including the most frequently given one. That is because knowledge is logarithmic and not accumulative or serial.
To put it simply, if you take a group of 50 physicists and you take Einstein and you ask a question about relativity, who is likely going to give the correct answer ? Einstein, period. Or to put it slightly differently, if you take a group of 100 people with an IQ of 130 they can never intellectually compete with a single individual with an IQ of 180. You simply cannot serially link ability. Five judges are not more accurate than one. It is unpredictable. They may be better, they may be the same or they may be worse. It all boils down to skill and knowledge.
2) Criteria for scoring techniques
I agree with everything you say.
In subsection 'B' it is said that a completely wrong technique must always be scored lower than a technique which only lacks realism or effectiveness. That sounds OK, but it isn't. Two reasons.
Suppose we do nage-no-kata, I am tori, you are uke, and we are about to start uki-goshi. You immediately attack me and hit me with your left fist. I notice this and appropriately respond by throwing you with right uki-goshi. As you get up you realize your mistake and now attack with your right fist, to which I respond with left uki-goshi. Does this really deserve to be marked down ? I could argue that I showed a higher understanding than most others because you deviated from the prescribed pattern by switching sides and I still reacted properly. We also did not forget a throw or attack, nor did we mix the position of uki-goshi with that of harai-goshi. The term "mistake" is not even proper here since at no point were the principle and objectives of this kata violated, and it adhered to the prime objective of kata: to improve my judo.
In subsection 'C' you write about techniques lacking in realism that should be scored between 5-7 points. This may sound logical, but it too is open to problems, in fact two problems.
Firstly, what every judge sees behind 'realism' greatly differs. We see the same here on the forum. Fair enough, the spread is probably large, with some people being relatively novice while others aren't and yes, we have the ubiquitous critics of us all cowardly hiding behind our screen names, no doubt. However, that is not the point, the point is what is this realism ? I argue that most jûdôka do not know what 'realism' in kata is. If they knew what it was they wouldn't be doing what they are doing. 'Realism' to them is what they know the people who are in a jury suggest they want. You can't seriously argue that any, but literally any koshiki-no-kata as it has ever been performed during the EJU continental kata championships or even today during the All Japan Kata Championships contains anything remotely connected to 'realism'. So how then can judges be assigning scores that go above 7, or even above 5, when these performances completely lack realism and when people's knowledge about what precisely realism is there is lacking ? At least in nage-no-kata, yes, if one performs a 3rd kyû level nage-no-kata and strikes the opponent with a "weak hand" somewhere hitting a hole in the sky, sure, many will see that, but otherwise, it simply is not there.
There are a couple of Kôdôkan goshinjutsu videos from Kyôto online. It's outrageous, I was watching one and discussing privately with a couple of JudoForum members, and it struck me that uke was never out of balance. So, something as fundamental as kuzushi, and that at world level, and in an advanced kata, was totally absent. How can that be ? How can someone like that gain a medal, and why do scores exceed 5 ?
In the jû-no-kata this is very problematic too. I can count on the fingers of one hand the people who seem to know what realism is in jû-no-kata. Jû-no-kata today is mostly, and you know this too, performed devoid of realism. It is performed as an aesthetic gymnastics exercise. I was at the Kôdôkan international kata summer school, I think in 2010, and there were two Romanian girls who received the highest score and a special certificate. Their performance was absolutely awful. What they did was gymnastics, and actually they both turned out to have a gymnastics past. It was synchronized acrobatics, but with jûdô it had nothing to do. There were no attacks, no realism, it was a dance, a display to impress with what they could do with their bodies, how flexible they were. Why were these people not simply failed ? I would have given higher marks to an 80-year-old with a hernia who shows how the exercise still contributes to his health, where there is true action/reaction, where you can see the atemi, the resistance, the attack, the defense, the response gô/jû.
So, one can write all one wants and say "5-7 basically correct, but flawed in realism", as long as this can't be filled in properly by people with the proper skills it is an empty carcass.
I also don't think that the lack of realism should imply that one just for that can't go lower than 5-7. When I watched the European Championships, or European 'Cup' Koshiki-no-kata, whatever, I cannot for the life of me get those scores. When I score what I see, the scores vary between 0-2. That is not a lack of realism where a score of 5 would still be justified. Impossible. It's a complete absence of comprehension.
The scoring sheet of IJF competitions
I notice you write "the task of making scoring sheets had to be done". Why ? I can see that to some people it would sound logical, but that does not mean it is. Quantitative research is not the same as qualitative research, different measures, different ways of expression. Kata is essentially a qualitative experience, not a quantitative one. However, what occurs is that the IJF strangely enough attempts to implement quantitative evaluation of qualitative processes. This is not possible. The problem, obviously, is that the qualitative appreciation is very difficult to standardize and very difficult to compare across different couples. If I evaluate three couples as follows:
A. This was superb, excellent expression of jû.
B. This was fantastic, excellent expression of realism.
C. This was awesome, excellent action/reaction.
Who now should be the winner, couple A, B or C ? But if I give couple A 9.5, couple B. 9.25, and couple C. 9, there is no doubt, and it is made easy. Only, can I be sure that what I wrote in A, B, and C is accurately reflected in the numbers I listed ?
I very much like the fact that despite being an active competitor you show the spine to be critical about the rules. That takes a lot of courage, knowing that someone could always retaliate. I find again that in your criticism you are very close to mine !
Positioning
You are right, again. "Wrong positioning on the tatami is regarded as a medium mistake". Why ? What evidence is there to say that wrong position is a medium mistake ? What is the IJF's basis for such dogma ? When is a position 'wrong' ? When you are 30 cm off the desired spot ? or 1 tatami or 2 meters, or turned to the wrong side ? From what point does it become a "wrong positioning" ?
The example you mention is correct, and I agree with you. There are other concerns. Is a 'different' position detrimental to the kata or does it follow from natural action/reaction ?
If you are doing katame-no-kata and have just completed kuzure-gesa-gatame and you decide to move all the way to tô-ma, get up and go sit at uke's head for kami-shihô-gatame and suddenly realize that you actually want to do kata-gatame, well, then that is obviously a problem and a mistake.
Luckily, nage- and katame-no-kata have rather simple and strict patterns, so it is hard to be really in the wrong position unless tori goes to stand in uke's spot or the other way around, and in jû-no-kata if you are not in the right position you usually can't do the technique. So, really wrong positions are an issue in kata with either very liberal patterns (goshinjutsu) or very complex patterns (koshiki-no-kata, but the latter is not yet on the IJF program).
The conclusion you arrive at is, I think, often true, but not the only one. You write "one finds out, that the cut for the medal-ranks is around 80% which is an average score of 8 points. So there are only very few people in the world at the moment, who can demonstrate a full kata without any "medium" mistakes". That is not entirely true. What is missing there is : "who can demonstrate a full kata without --what the judges deem to be-- a medium mistake". Besides, the mathematical and statistical problems that skew the marks make the actual number of 80% at which you arrive invalid. What it really means is that there are people who can do these kata at a significantly different level than others, but those with a 90% level of expertise and those with a 70% level of expertise are likely both also forced into the group of 80%, where they do not belong. You can only draw the conclusion you propose IF the 80% actually accurately reflects a level of 80%. It doesn't.
You mention "Of course there are a lot of other people believing, that they can do it, but a reality check usually opens their eyes." (...)
You are probably correct, BUT it depends on what that reality check is. Vladimir Horowitz used to say he was irritated by critics and didn't need them, as he knew well enough what he did himself and was a far harsher critic of himself than people imagined. The simple matter of carrying out a "reality check" is not so easy to achieve. You have been very successful in kata competitions, especially jû-no-kata. But what if I would say, "well, we are going to do this somewhat differently this time. You will compete, but not with your usual partner. Instead, I will determine your partner and introduce you to him 2 hours before your competition". Would you accept that as a basis for a reality check on your level of jû-no-kata ? You do realize that the partner I picked out for you is a blue belt of 140 kg, who has done the kata only twice before. I've been there in scenarios perhaps not that extreme but still not common. Years ago I jumped in to help out someone from another country, far more junior, whose partner needed to undergo emergency surgery. We didn't medal, I thought it was very satisfactory that we got that far, but nevertheless some people tried to use the incident to suggest that after all this proved that our kata skills weren't as good as we wanted people to believe. Maybe they are right. I am not sure what I want people to believe when it comes to kata skills, but no doubt my own kata skills are not as good as I want them to be. In fact, there is nothing I am as good at as I want to be. My point is that a "reality check" requires a check of reality and not a set-up horror scenario either. There are incidents known where the great Italian pianist Arturo Benedetti Michelangeli, after entering the stage, started taking his piano apart, played a couple of notes, took it apart again, and then got up and left, never to return, leaving behind a stunned audience. It actually happened more than once.
When I was living in Japan, Michelangeli was announced to perform in Ôsaka, so I got ready to purchase tickets to attend one of his extremely rare performances. Only, Michelangeli never played. In consequence, Japanese customs confiscated his piano for breach of contract, and Michelangeli vowed to never set foot again in Japan, and he never did. For him as an artist of a supreme level, Japan's surgical and cold economic reaction was beneath the understanding of what a true artist represents. Does the fact that he either did not show up or only played a couple of notes allow us to consider this as a "reality check" that he couldn't really play the piano very well ?
No doubt kata as a demonstration or performance rarely takes place in conditions that are ideal for everyone, but the conditions should at least reach a certain level, and people should receive some level of accommodation if you want a "reality check" to really reflect "reality" and not "sabotage". Besides that, you also must consider that for some this "reality check" simply is no longer possible due to old age and physical impairments.
Some years ago I was on a nage-no-kata course with Abe Ichirô, at that time still 9th dan. I was helping out together with Satô Tadashi, and at one point Abe wanted to correct kata-guruma, so he wanted to do it but couldn't get up. He tried again, but had to give up. Is that a "reality check" that he can't do nage-no-kata ? Could we, supposing he had demonstrated the whole kata, assign him just a "medium mistake" or a "big mistake" ? After all if you do the same during your contest, you will be given such a penalization, no ? So, Abe couldn't do it while most of us can. Does this then mean we are all much better ? If it doesn't mean that, then what does it mean and how do you correctly address that and factor it in ? By the way, if I had been a judge that time and had seen Abe do it, I doubt that anyone else would have gotten a score as high as him. You know why ? His body position, use of his hara, kuzushi, timing, it was all there. He couldn't complete it, but there is no way one had to correct his position. I remember him also performing ô-soto-gari as part of sode-dori in kime-no-kata. None of the performers I usually see in kime-no-kata performs it so correctly in terms of body position, center of gravity and other things. It was jûdô, plain and simple.
One could to the same extent elaborate on some of Mifune's performances. After all there are times he steps back with his left foot instead of his right, times he realizes he started walking too soon and steps back again, all kinds of things. So are these all mistakes, "small" or "medium", and can we then legitimately conclude that most of our kata couples today all perform their kata much better than Mifune ? I assure you they don't, but one also has to have reached the level where one can see beyond those mechanics and take them for what they are. They are not even relevant anymore at that level. Who cares whether he steps back first with his left or right foot ? How would that even be relevant to the core or objective of the exercise ? The stepping is a convention, an implementation of a rationale, but for the rest it is irrelevant to the core of the kata.
As long as the judging system is not equipped to distinguish these issues, and does not have people who have long transcended the constraints of the mechanical approach to kata, kata can't be properly evaluated and the system constantly produces false results. One could say that that result is winning or not winning a medal, but it is not limited to that. If the objective of kata is indeed improving someone's jûdô, then such an erroneous conclusion passes judgement on the level of the jûdôka involved. In this case that would imply that most people who participate in kata today are then much better jûdôka than Mifune, since few will ... "make the mistake of" stepping back with the left foot first.
That being said, as you know, when we are critical, we often get angry responses saying "can you do it better". So, it is only fair that we look for improvements rather than to be just negative. I think that some problems can be resolved, but I think others can't as some of the problems are inherent to the nature of the beast.
When I look back at the phenomenon of kata competitions and want to make an evaluation, I come to the following conclusions. I too competed in kata, but I found it very frustrating. You know why ? I could not find any available partner whose kata level I deemed sufficient, or they already formed part of a known kata couple and were thus "occupied". The offers I did receive I rejected, all of them. You know why I started competing ? I did it to create an external motivation towards having to know every possible detail and research everything about it. Even though I competed in it far less than you, I did achieve the motivational part and researched kata to the bone. To some extent I recognize the same in you. Your participation in kata competition has brought with it a whole plethora of different things and you actively research, discuss, communicate, teach. But I don't think that your development in response to participating in kata competition is how the average person develops. I know many other kata performers besides yourself, and while all of them are devoted, and practice and want to win, I don't think I know of many other people who have grown so much due to this activity. If the effect was the same in everyone, I would probably embrace it more. At the end of the day, I don't think that what I observe really is the effect or merit of kata competition but of yourself. Why ? Well, you already developed as a very good competitive jûdôka when you were still fighting shiai. I see this more as a product of personality and having been exposed to the right teachers than as the merit of kata competition. In other words, even if kata competition had not existed, you probably would have developed in this field in a way far above average too. It's all speculation, but it is still what I believe.
Last edited by Cichorei Kano on Wed Nov 27, 2013 11:57 am; edited 1 time in total