Intrigued by Randomness: Mahjong, Crappy Machine Learning, and Gacha Games
Humans are pretty drawn to randomness. That’s probably a well-known fact, but I’ve only just come to understand how true it is, and I want to talk about it through some of my personal experiences, the ones listed in the title.
Mahjong is a complex game. I’ve seen elders gather in community spaces to play it, but I never had any idea why the tiles held by the winning player made them win. There were more interesting games to play, though, so I didn’t try to learn it. That is, until the online game Mahjong Soul came out, which is just Japanese Mahjong played against random people online. The game is obviously targeted at my generation, with a cutesy art style and anime girls as player avatars (they even added anime guys later). So lots of my friends started playing it, and pretty enthusiastically, too.
I tried to dabble in it, but the new player guide is filled with undecipherable jargon: unknown phrases explained by even more unknown phrases, and 20+ different winning conditions, all intertwined. The rules overwhelmed me to no end, so I gave up. Then this year, with many of my friends still playing it often, they urged me to try again. I said I couldn’t understand the rules and threw bags of questions at them, like "what does phrase X even mean?", in an attempt to show them that I was unable to understand the rules. But somehow, this time, they actually had the patience to answer all of my petty questions. Over an hour later, being the one asking the questions, I had inevitably started to grasp the rules of Mahjong, or at least, to quote one friend, "about 20% of Mahjong rules".
Wait, what, there’s more? That took such a long time and so much patience out of everyone, and I only learned about 20%? I exclaimed, "That just shows how complex Mahjong’s rules are!"
They suggested that I "just dive in, understanding will come naturally". But my experience disagrees. All I can do is play robotically or arbitrarily. There is no time in each turn to think everything through, and most of the time I have no idea what I am doing. There is little feedback other than winning. If I take some action, it is very difficult to know whether that action was beneficial or harmful. When I lose, even if every decision I made in that game was a good one, I wouldn’t know. And if I do win, what does that mean? Which decisions led to the win?
Compared to other tabletop games like chess, Mahjong is much more complex. It requires a high commitment: one needs to sit down and study it just to be able to start playing. In chess, by contrast, learning the rules is quick and feedback is easy to come by during play. If a piece is taken, that likely comes from an earlier wrong move; if I put the opponent in a difficult position, that probably means I did something right. To me, chess is a superior game to Mahjong.
But I missed a critical factor: chess has zero randomness. It’s a game of pure skill (or, in my case, blind luck). Mahjong feels more than half determined by chance: the starting hand is random, and each tile is drawn from a shuffled deck. Skill does matter, but each player is still at the mercy of Fortuna. Better players are merely "more likely" to win, and that only shows up over a large number of games. In an isolated set of several games, worse players still win sometimes. This means, as some other friends pointed out, that Mahjong is a great party game: the fun comes from the randomness. Indeed, if all that happened was weak players being crushed by strong players, it wouldn’t be fun for a group of people with different skill levels. That is what happens with chess, and with all other games that emphasize skill.
The luck factor also brings thrills to experienced players. Something unpredicted can always happen: anyone may draw the tile they need at any time, or go a long time without drawing a useful one. There can always be something new. I suppose that is why the elders in community spaces play it from day to night, seemingly tireless.
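To put rough numbers on "more likely to win, but only over many games", here is a small sketch. It makes two loud assumptions that are not from Mahjong itself: the game is simplified to two players, and the stronger one wins any single game with a made-up probability of 0.55. The function name majority_win_probability is hypothetical too. It computes the exact chance that the stronger player wins more than half of n games.

```python
from math import comb, fsum


def majority_win_probability(p, n):
    """Chance of winning more than half of n independent games,
    when each single game is won with probability p (n should be odd)."""
    needed = n // 2 + 1  # smallest number of wins that is a majority
    # Sum the binomial probabilities of every winning count from `needed` to n.
    return fsum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(needed, n + 1)
    )


# Assumed edge for the stronger player; 0.55 is an illustration, not a measurement.
ASSUMED_EDGE = 0.55

for games in (1, 3, 11, 101, 1001):
    chance = majority_win_probability(ASSUMED_EDGE, games)
    print(f"{games:>4} games: stronger player comes out ahead with probability {chance:.3f}")
```

Under these assumptions, a single evening of a few games is close to a coin flip, while the stronger player only becomes a near-certain overall winner after hundreds of games. Real Mahjong has four players and scoring rather than simple wins, so this only illustrates the shape of the argument.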
A couple of years ago (I can’t believe it has been that long), I wrote a chatbot, HoroBot. It runs in a couple of Telegram and IRC groups. Its core functionality is to send random emojis whenever a discussion is happening in the group. It is just for fun, to see how merely sending random emojis can make a bot seem to blend into a group chat. This was a success: for a long while, some people didn’t even notice that there was a bot hanging out among us. They joked that it had passed the Turing test. Later, I added more features to it, one of which is detecting whether a chat participant is "of the bot’s own kind". Because Horo is a wolf, this feature is called the "wolf detector".
The wolf detector is based on a Bayesian classifier. Don’t worry, you don’t need to know anything technical. Neither did I. All I knew was that it’s a tool that lets a computer program categorize sentences automatically. First, you train the program with a set of sentences along with the category each sentence is known to belong to. Then, with enough training, the program should know what a sentence in each category "looks like", so that when I give it another sentence, it will tell me its category. It’s a form of machine learning. Such things are often used for detecting spam messages, or for telling whether user reviews of products are positive or negative. I wasn’t using it for anything that serious, though; I wanted HoroBot to tell whether a chat participant is "of HoroBot’s own kind" or not.
For my application, there are 2 categories, "is" and "is not" (of HoroBot’s own kind). I trained the program with the past chat logs of the ArchLinuxCN group at the time. Messages sent by anyone with a name containing "horo" were put in the "is" category, and all others in "is not". It was obviously pointless, but the whole thing was just for fun. After training, whenever HoroBot sees a new message, it assigns that message a category, and the categories assigned to the last 100 messages sent by a user determine that user’s score, i.e. the "probability of being the bot’s own kind".
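The pipeline described above can be sketched roughly as follows. This is not HoroBot’s actual code: the class names NaiveBayes and WolfScore are hypothetical, the classifier is a textbook multinomial naive Bayes with add-one smoothing, messages are split on whitespace purely for the sake of the example, and the score is assumed to be the fraction of a user’s last 100 messages that landed in the "is" category.

```python
import math
from collections import Counter, defaultdict, deque


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.message_counts = Counter()          # category -> messages seen
        self.vocabulary = set()

    def train(self, words, category):
        self.word_counts[category].update(words)
        self.message_counts[category] += 1
        self.vocabulary.update(words)

    def classify(self, words):
        total_messages = sum(self.message_counts.values())
        best_category, best_log_prob = None, -math.inf
        for category, n_messages in self.message_counts.items():
            # log P(category) + sum over words of log P(word | category)
            log_prob = math.log(n_messages / total_messages)
            total_words = sum(self.word_counts[category].values())
            denominator = total_words + len(self.vocabulary)
            for word in words:
                log_prob += math.log((self.word_counts[category][word] + 1) / denominator)
            if log_prob > best_log_prob:
                best_category, best_log_prob = category, log_prob
        return best_category


class WolfScore:
    """Keeps the categories of each user's most recent messages (assumed window: 100)."""

    def __init__(self, window=100):
        self.recent = defaultdict(lambda: deque(maxlen=window))

    def record(self, user, category):
        self.recent[user].append(category == "is")

    def score(self, user):
        history = self.recent.get(user)
        return sum(history) / len(history) if history else None


# Made-up training data: sender names containing "horo" are labeled "is".
classifier = NaiveBayes()
classifier.train("awoo I love apples".split(), "is")
classifier.train("please update your system".split(), "is not")

scores = WolfScore()
for message in ["apples awoo", "update your system", "love apples", "awoo"]:
    scores.record("alice", classifier.classify(message.split()))

print(scores.score("alice"))  # 0.75: three of the four messages were classified "is"
```

The deque with maxlen silently drops the oldest entry once a user has more than 100 recorded messages, which is one simple way to get a "last 100 messages" window.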
The results are, as expected, pretty useless. What does it even mean to be "of HoroBot’s own kind"? And because there is no way of interpreting this score, there is no way of telling how accurate it is. Most of the scores appear to be around 50%, which makes me suspect that if I plotted everyone’s scores, they would form a normal distribution, i.e. the category of each message is pretty much random. I have not tested that suspicion, though. Actually, it’s potentially worse than random guessing: the "original horos" (people whose names contain "horo" in the chat logs I used for training) checked their scores, and theirs were consistently lower than everyone else’s!
I will confess now that I do have some idea why my Bayesian classifier was so crappy. A Bayesian classifier needs a sentence to be a list of words, rather than a list of characters, so I needed a way of segmenting sentences into words. In Chinese, there are no spaces between words, so segmenting is not trivial. I couldn’t be bothered to do anything elaborate, so I just decided arbitrarily: every 2 characters is a word! That is clearly a terrible way of segmenting. It has probably made every sentence meaningless.
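To see how badly "every 2 characters is a word" behaves, here is a minimal sketch. The function name segment_pairs is hypothetical, and how a leftover odd character or punctuation was handled in the real bot is not stated, so keeping the remainder as its own "word" is an assumption.

```python
def segment_pairs(sentence, size=2):
    """Cut a sentence into consecutive chunks of `size` characters.
    A leftover shorter chunk at the end is kept as-is (an assumption)."""
    return [sentence[i:i + size] for i in range(0, len(sentence), size)]


# "我喜欢吃苹果" means "I like eating apples"; the real words are 我 / 喜欢 / 吃 / 苹果.
print(segment_pairs("我喜欢吃苹果"))  # ['我喜', '欢吃', '苹果']

# Dropping the first character shifts every boundary after it.
print(segment_pairs("喜欢吃苹果"))  # ['喜欢', '吃苹', '果']
```

In the first sentence the word 喜欢 ("like") is cut in half, and in the second it survives only because the boundaries happened to shift by one character. The same word can therefore show up as entirely different "words" depending on where it sits in a message, which leaves the classifier with very little to learn from.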
Incredibly, despite all that, this is the most popular feature of HoroBot! To this day, several years later, even though I have not updated the bot in a long time, participants of the ArchLinuxCN chat group still seem to check their wolf detection scores often. It puzzles me: it’s just a pointless, practically random number! Why have they not gotten tired of it yet? But well, I guess I made something that people like, and that should be something to be proud of.
This is purely baseless speculation, but I guess one reason the wolf detector is popular is its unpredictability. The result is not random, since the same message always yields the same categorization, but across different messages it appears random. We know the score is somehow related to the messages a person has sent, yet the relation itself is unknown. Such randomness intrigues us, as we are curious to see "Does sending this message make my score go up or down?" Psychologists say that humans love finding patterns in random events. Here we do have patterns, however nonsensical they may be, somewhere within the algorithm, and knowing this makes us wonder all the more strongly.
I’d like to think of myself as a relatively rational person. Playing gacha games is a thing to be looked down upon in my book. Here, "gacha games" refers to video games that have subpar gameplay, and whose main attraction is collecting "cards" that are obtained randomly by paying for microtransactions. The cards usually depict lovable characters.
The profits of gacha game makers are surely staked on our attraction to randomness. Each time one pays for a microtransaction, one is intrigued by the anticipation of a new card (or the lack thereof: most of the time in this type of game, getting a collectible card is a rare event, and the draw usually yields less exciting stuff). The problem is, being attracted to that means I become more likely to spend more money on totally useless cards, and devote time to virtual characters who care absolutely nothing for me (aside from their creators wanting to extort more money from me). This is different from being drawn towards playing Mahjong with friends, which at least nurtures friendship, or watching over a wolf detector score, which doesn’t cost much.
I would argue that playing gacha games is a lot like drinking alcohol. Doing it a lot is definitely bad. Doing it a little is probably acceptable, but unhealthy nonetheless. To be fair, it is really difficult to resist paying for the card draws, so I resorted to just staying away from gacha games. Unfortunately, that means I have missed out on some games that have gacha mechanics but also some real content.