Stop Leaving Your Success to Luck: A Researcher's Guide to Hacking Randomness

If a million people each flip a coin 20 times, you should expect about one of them to get heads every single time, since 2^20 is roughly a million. Even when everyone is given the same number of chances, nobody is luckier than anyone else, and the playing field is perfectly equal, some people will still end up looking lucky.
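As a quick sanity check on that arithmetic, here is a minimal sketch that assumes a fair coin and fully independent flippers. The variable names are mine, chosen only for illustration.

```python
# Chance that somebody among n_people flips 20 heads in a row with a fair coin.
n_people = 1_000_000
n_flips = 20

p_streak = 0.5 ** n_flips                         # one person's chance: 1 / 1,048,576
expected_streakers = n_people * p_streak          # average number of "superstars"
p_at_least_one = 1 - (1 - p_streak) ** n_people   # chance that at least one exists

print(f"expected people with {n_flips} straight heads: {expected_streakers:.2f}")
print(f"probability that at least one exists: {p_at_least_one:.2f}")
```

Under these assumptions the expected count is a little under one, and the chance that at least one such person exists is roughly 60 percent. A lucky streaker is likely, though not guaranteed.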

A superstar may not be a superstar because of their ability at all. They may simply be the person who happened to flip heads 20 times in a row.

In this essay, I will explain why that is not the whole story, why randomness can be controlled, and in particular, how to carry out research while working well with randomness.

I will focus mainly on how to reduce variance, that is, how to reduce risk. Anyone can increase risk without knowing anything, but reducing risk requires knowledge. Put differently, reducing risk is a reproducible skill that you can acquire. Increasing risk skillfully, on the other hand, is a matter of taste. My goal here is for you to learn how to reduce risk, and then decide for yourself, based on your own convictions and instincts, where you want to take risks.

There are three weapons for fighting randomness.

The law of large numbers: if you increase the number of trials, randomness matters less.
Optimizing the order of random events: fix the random outcomes first.
Improvisation: once the random outcome is revealed, ride it.

I will explain each of these in context.

Let us walk through the actual process of doing research and producing a paper, and think about what kinds of tricks can lower variance along the way. I am a researcher, so I will talk about research papers, but I think these methods are useful in other kinds of business as well.

First, decide on the research topic that will form the basis of the paper. It can be anything, so just pick something. The topic does not need to be special.

To begin with, what topic you happen to encounter is a matter of luck. Of course, some topics are better than others to some extent, but no matter how much you struggle or pray, you are not going to somehow reach a better topic than the cards already in your hand.

People often say that good research is “amateur thinking, expert execution.” The idea part can be “amateur thinking,” meaning there is not much you can do there; the part you can really optimize is the execution.

The biggest reason to pick quickly is that it is better to run many research projects. Whether any given research project goes well depends heavily on luck. The “number” of research projects is the only factor you can reliably control, so it is better to choose a topic quickly, finish quickly, and accumulate more projects. If you increase the number of projects, the law of large numbers starts to work in your favor.

The “luck-based superstar” from the beginning is weak against large numbers. Someone who flipped heads 20 times in a row may not be able to do it 30 times in a row. In that case, all you need to do is flip the coin 30 times, or 100 times, and accumulate more heads.
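To make the law of large numbers concrete, the sketch below shows how the spread of your overall success rate shrinks as the number of projects grows. It assumes every project succeeds independently with the same probability. The value 0.3 and the function name are hypothetical choices, not figures from this essay.

```python
import math

def success_rate_spread(p: float, n_projects: int) -> float:
    """Standard deviation of the fraction of successful projects,
    assuming each project succeeds independently with probability p."""
    return math.sqrt(p * (1 - p) / n_projects)

p_success = 0.3  # hypothetical per-project success probability
for n in (1, 4, 16, 64):
    spread = success_rate_spread(p_success, n)
    print(f"{n:>2} projects: success rate {p_success:.0%} +/- {spread:.1%}")
```

The spread falls with the square root of the number of projects, so quadrupling the number of projects halves how much luck can distort your track record.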

You definitely improve as you do more research projects. I have written more than 20 first-author papers, and even now I still feel myself growing every time I write one.

Beginners have a narrow sense of what even counts as a “research topic.” Those topics tend to be crowded and competitive. Topic collisions are likely to happen.

As you improve, your strike zone widens, and you become able to pick up edge-case topics that nobody else would touch and still make them work. Once that happens, you no longer have to worry about running out of topics, and you can avoid competition and gain stability.

Rather than agonizing too much over your first topic, it is better to start quickly, finish the research quickly, and build your ability. Once you have the ability, good research will come naturally, and your stability will improve as well.

You definitely improve as you do more research projects. What matters especially is that your strike zone widens and your improvisational ability improves.

For example, suppose you all gave me three machine learning topics at random, say adversarial learning, LLM-as-a-judge, and bandits, and told me to write a paper combining all three within one month. I am confident that I could do it.

Likewise, even if the experimental results are bad, or the theorem you originally expected turns out to be unprovable, you become able to take whatever materials you currently have and still shape them into a good paper.

If you can write a good paper from any topic and from any result, then even if the research trend changes, you will not lose your footing. That gives both financial and psychological stability.

For example, I once wrote a book called Accelerating Deep Neural Networks. The original Japanese manuscript that this book was based on began because an editor contacted me and asked whether I would write a book on accelerating deep models. I said yes immediately. At the time I replied, I did not yet have a concrete story or landing point in mind, but I was confident in my ability, and I believed that no matter how things turned out, I would be able to make it into a good book. In the end, I centered the book around “the relationship among acceleration, compression, and generalization” and finished it in a little under a year. A book is 300 pages long, so finding a good landing point is harder than for a paper, but as I wrote, I found the right place to land through improvisation and brought it to a successful close. Improvisational ability comes from experience.

Now let us return to the topic of paper themes. I said you should pick anything and just get started. That is the most important point, and in principle it is fine to choose based on pure feeling, but if I had to give one criterion from the standpoint of lowering variance, it would be this: “Prefer a decent idea you had a year ago over the best idea you came up with recently.”

Research moves quickly these days, so younger researchers in particular tend to jump on the newest idea they have just come up with. But if your work depends too much on fresh ideas, your research will not be stable. A similar paper may come out first from a bigger group, or by the time you finish, the trend may already have moved on.

A common failure mode in research is that an idea that seemed wonderful when you started begins to feel boring a year later, or even halfway through the project.

Brand-new ideas look attractive, but they also fade quickly. Such ideas are likely to feel boring a year later. Research is a long-term battle measured in years. The initial excitement does not last for a full year. Managing your enthusiasm over the long term matters.

A decent idea you had a year ago has already survived one year, so there is reason to expect that it will still hold a reasonable amount of energy a year from now. It is also common for an old idea to rise in your estimation once you actually try it and think, “This is more interesting than I expected.” Since your ability has improved since then, you may find new insights in it. If even with cool eyes it still seems “decent,” then once you light the fire, it often leads to unexpectedly good places.

You may worry that if you let an idea sit for a year, a rival will beat you to it. But if an idea is the kind of thing where a few months matter that much, then taking it on in the first place is already a big gamble. Letting it sit also has the advantage of helping you avoid that gamble.

The quality of an idea involves a tradeoff between (1) newer ideas tending to be better, that is, having higher expected value, and (2) older ideas tending to be more stable, that is, having lower variance. Most people think only about (1), but if you also take (2) seriously, your work becomes more stable. Also, expected value is easy to misjudge, while differences in variance are often substantial.
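One way to picture this tradeoff is to model each idea's payoff as a normal distribution and ask how often it flops. This is only a sketch. The quality scale, the means, the standard deviations, and the flop threshold are all hypothetical numbers chosen for illustration.

```python
from statistics import NormalDist

# Hypothetical payoff distributions on an arbitrary "paper quality" scale.
fresh_idea = NormalDist(mu=6.0, sigma=3.0)   # higher expected value, higher variance
aged_idea = NormalDist(mu=5.0, sigma=1.0)    # lower expected value, lower variance
flop_threshold = 3.0                         # below this, the project counts as a flop

for name, idea in (("fresh idea", fresh_idea), ("year-old idea", aged_idea)):
    p_flop = idea.cdf(flop_threshold)        # probability the payoff lands below the threshold
    print(f"{name}: mean {idea.mean:.1f}, P(flop) = {p_flop:.1%}")
```

With these made-up numbers, the fresh idea has the higher mean but flops about one time in six, while the year-old idea flops only about one time in forty.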

When you are a beginner, you may not be able to think of even a single research topic in the first place. In that case, it is good to decide the constraints randomly. You might call this a challenge run. You can choose a direction such as “the opposite of my previous work,” “the one that would maximize citations,” or “the kind of thing Professor X would like,” or you can choose constraints randomly as in a three-word storytelling prompt.

The reason to set constraints is that when people have too many options, they postpone action. If you wander into a jam shop and see 30 kinds of jam on display, you may look around with interest, but you will probably think, “Maybe next time.” If there is only one kind of jam, you can just buy that jam. The important thing is to act and to choose a topic. To make that happen, force the options to narrow early and push yourself into motion.

Another good thing about introducing random constraints like this is that it helps you avoid producing the same kind of research every time.

It is also good training for improving your improvisational ability. If you repeatedly practice finding a path forward under constraints, then even when strange random outcomes appear, you become able to find a path through them.

For example, the study Word Tour (NAACL 2022) came from my thinking, “I have never written an NLP paper, but it would be interesting as a test of my ability if I could get a solo-authored paper accepted to an NLP conference without relying on an NLP specialist.” Of course, that was not the whole story, but it is true that this constraint was one element in how I chose the topic. Even if your motivation is not entirely serious, it is fine as long as you do the work properly.

Now that the topic has been chosen, the next step is to decide the axis of the research.

The basic flow for research, talks, books, and so on is as follows.

Decide the topic (or receive one)
Come up with 10 claims
Among them, choose the main claim you most want to communicate
Decide on an axis that runs through that main claim
Along that axis, select, discard, and generate claims

At the moment the topic is fixed, neither the axis nor the main claim has been found yet. As I have said, the topic may even be chosen randomly. This is important. Topics whose axis is obvious from the beginning are not interesting, either to you or to the audience. They are too predetermined. The value comes from the time you spend struggling to find the axis after the topic has been fixed, and then from discarding what is unnecessary once the axis has been found.

To make the topic work, think deeply about it and produce 10 claims. As you engage with the topic, you will begin to see many things that were invisible before you started. Whatever those things are, what newly comes into view has value. If anything moves you even a little while you are touching the topic, make sure to record it carefully. It will be good raw material for building claims. There is no need to worry that the fact itself may not be novel. Any fact, depending on how it is seen, can become a valuable observation. A genius is not someone who sees some new fact that only they can see. A genius is someone who notices the importance of an old fact that everyone else saw as well, but failed to appreciate. If something moved your own mind, it will probably be able to move someone else’s too. Do not be intimidated. Go ahead and build the claim.

Once the claims and the main claim are decided, you determine the axis and organize the claims around it. If you cram every possible claim into one paper, the result becomes blurry and hard to understand. If you keep only the claims that align with the axis, the paper’s outline becomes sharper and easier to communicate. For example, this essay itself contains various claims, but by organizing them around the axis of randomness and developing the discussion along that axis, I am able to give the whole piece coherence.

A paper only needs to spend 10 pages communicating this main claim. The experiments, the theorems, the other messages, and the writing itself are all means for communicating that claim. If all you need to do is spend 10 pages conveying one or two sentences’ worth of message, then it should be easy. You have read hundreds of papers by now, but you probably remember 90 percent of them only vaguely. If you can carve even one sentence into the reader’s mind, you are already in the top 10 percent. In short, aim to imprint the main claim into the reader’s memory.

Once the axis is set, proceed with the research along that axis.

I start with preliminary experiments. For a theoretical paper, I first prove the main theorem. There is a school of thought that says you should write the paper first, or at least write the introduction first, but I do the opposite. This is because I want to optimize the order of randomness: the random outcomes should be fixed early.

You do not know what kind of results an experiment will produce until you run it. Depending on whether the outcome is good or bad, the tone of the introduction and the arguments you need to make will change.

For example, suppose you have a six-turn game, and the last three turns are determined by randomness. Then you have to do work while imagining every possible random outcome, which is exhausting, and in the end the result is still determined by luck.

By contrast, if the first three of the six turns are determined by randomness, then you only need to deal with the one realized outcome that actually occurred. That is much easier. Also, because you can control the way you finish the game after that point, the overall process becomes more stable.

In other words, the more uncertainty something contains, the earlier it should be placed.
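The six-turn game can be made concrete with a toy model. In this sketch, three turns are fair coin flips and three turns are your moves, and you score a point for each move that matches its coin. The scoring rule is my own assumption, introduced only to illustrate the ordering effect.

```python
from itertools import product
from statistics import mean, pvariance

# Toy version of the six-turn game: three fair coin flips and three moves by you.
# You score one point for each move that matches the corresponding coin.
coin_outcomes = list(product((0, 1), repeat=3))   # all 8 equally likely outcomes

def score(moves, coins):
    return sum(m == c for m, c in zip(moves, coins))

# Randomness last: you must commit to a plan before seeing any coin.
blind_plan = (0, 0, 0)   # any fixed plan gives the same distribution of scores
scores_random_last = [score(blind_plan, coins) for coins in coin_outcomes]

# Randomness first: you see the coins, then respond to the one realized outcome.
scores_random_first = [score(coins, coins) for coins in coin_outcomes]

for label, scores in (("random last", scores_random_last),
                      ("random first", scores_random_first)):
    print(f"{label}: mean score {mean(scores):.2f}, variance {pvariance(scores):.2f}")
```

In this toy setting, moving the randomness to the front both raises the average score and removes the variance entirely, because every move is made with full knowledge of the outcome. Real research is never this clean, but the direction of the effect is the same.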

The same applies to paper writing. Experimental results, which are highly uncertain, should be fixed first, and then any recovery should be done through writing, which is something you can control more easily.

If the experiments do not go well, or if the main theorem cannot be proved, then settle for easier experiments or easier theorems. Experiments and theorems are only means for supporting the main claim. You are probably the only one who is attached to “that” particular experiment or “that” particular theorem; even a simpler experiment may be enough to persuade the reader.

A good way to design good experiments and good theorems is to imagine what you would say if someone told you, “That claim is false.” Think of how you would reply: “But look, this is why.” Then translate that reply into experiments or theorems. Since you yourself believe the claim, there should be words of rebuttal available no matter what someone says. Give those rebuttals concrete form.

If even then the experiments or the proof still do not work, it is often best to give up cleanly and move on to the next topic. You definitely improve as you do more research projects, and for that, it is better to run many projects. Rather than clinging to a single project, move on cleanly.

That said, it is good to output what you have done so far in some form. Ideally this would be in the form of a paper, such as a technical report, but a blog post is also fine. In any case, make sure the community can see what you did.

This matters because it helps you avoid getting stuck in a pattern of endless abandonment and never producing any output.

Also, whatever you put out will help the community. Other people are bound to think about the same things you did, and telling them about the pitfalls you fell into may save them. In addition, hiding unsuccessful outcomes creates publication bias, which is bad for the community. To help prevent that, publish your results, whatever they are. That helps both you and others.

For example, when I started the project that became Re-evaluating WMD (ICML 2022), what I really wanted to do was develop a cool method based on WMD. But the experiments went terribly, and I had to give up on developing the method. In the end, I thought carefully about why it failed and wrote the paper around that. You could say that I used improvisation to turn the worst possible outcome into a “playable” one. Even if experiments go badly, even if the random draw is bad, you cannot afford to get discouraged. Use your improvisational ability to make use of any random outcome.

When you actually write the manuscript, it is important to write all the way from beginning to end, even if it is rough. Dump what is in your head in automatic mode and complete what people call a vomit draft. Grammar mistakes are fine. Missing citations are fine for the moment. Just finish the whole thing as fast as possible.

The reason is that when there are too many options, people postpone action. Until the first draft is finished, there are too many uncertainties, which makes you start exploring too many possibilities, and before long the whole thing becomes unmanageable and you invent all kinds of excuses for procrastination. A blank manuscript amplifies anxiety. What matters is to force the uncertainty into a fixed form first, and only then improve it locally. That is the key to producing a high-quality manuscript quickly. Local improvement is easier, and because it moves forward steadily without excuses, the final result is usually better as well.

Writing is about balancing in-distribution and out-of-distribution elements. In-distribution means what an experienced reader can more or less predict will come next. Out-of-distribution means unexpected things, reversals, surprises, and new information.

A paper has to communicate something the reader does not already know. So all of its claims need to lie out of distribution.

If everything is in distribution, the paper becomes uninformative and boring. But if everything is out of distribution, communication breaks down. The balance matters. The exact ratio depends on the topic, target audience, and medium, but my image is about 90 percent in distribution and 10 percent out of distribution.

If the font in an Amazon product description looks suspicious, you start to doubt the quality of the product itself. The same is true for papers: if the English is strange, or if the structure does not follow convention, then the legitimacy of the message itself is called into question, and the paper becomes less persuasive. Follow the conventions of the community, and use the expressions and structures that readers are accustomed to. In-distribution writing is a problem that can be solved by next-token prediction, so if you read a lot of papers and train on them, you will get better. First of all, read a lot of papers.

Then, if you simply line up out-of-distribution claims that are aligned with your axis, the result will usually be a good paper that can be accepted at most venues.

Once the paper is complete, it is finally time to submit it. Peer review is highly random. It is exactly the theme of this essay. Many people have probably had the experience of seeing a paper they were sure would be accepted get rejected.

In an experiment at NeurIPS 2021, a subset of submissions was reviewed independently by two separate committees, and roughly half of the papers accepted by one committee were rejected by the other.

Getting accepted is luck. Getting rejected is luck too.

For the most part, the only real way to push back against this is the law of large numbers. The “number” of research projects is the only factor you can reliably control. If you increase the number, the law of large numbers starts to work in your favor. As I have said many times, it is better to choose topics quickly and finish them quickly.

Also, you definitely improve as you do more research projects. When I first started research, my acceptance rate was around 20 to 30 percent. Now it is around 50 to 70 percent. I can feel that my ability has improved. If you are currently troubled by the fact that your papers do not get accepted, then if you simply keep doing many research projects, I believe you too will eventually start getting papers accepted.

If a paper is rejected, the basic move is to resubmit and redraw the random number.
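Redrawing the random number can be quantified with a simple sketch. It assumes each submission is an independent draw with a fixed acceptance probability, which is optimistic because reviews of the same paper are correlated. The probabilities 0.25 and 0.6 are midpoints of the acceptance-rate ranges mentioned earlier, and the function name is hypothetical.

```python
def p_accepted_within(p_accept: float, n_submissions: int) -> float:
    """Chance of at least one acceptance in n_submissions tries,
    assuming every try is independent with the same acceptance probability."""
    return 1 - (1 - p_accept) ** n_submissions

for p in (0.25, 0.6):          # assumed per-submission acceptance probabilities
    for k in (1, 2, 3, 4):     # number of submissions, including resubmissions
        print(f"p = {p:.2f}, {k} submission(s): {p_accepted_within(p, k):.0%}")
```

Even at a 25 percent acceptance rate, three independent draws would give better than even odds under this model, which is why resubmitting is the default move.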

That said, if you have no motivation left, giving up is also an option. The worst outcome is to come to hate research, so do not push yourself too hard. But you should still output it somewhere, perhaps at a less competitive venue, or at the very least as a preprint. Make “do not overstrain yourself” the first condition, and then look for a compromise.

Do not forget that publication in a venue is for publicity. There are plenty of other ways to publicize your work as well: giving talks at different universities, promoting it on social media, doing SEO, and so on.

After acceptance, always promote your work.

Suppose you have a paper that 100 people will read. From there, compare these two options:

(A) Write one more paper that will be read by about the same number of people
(B) Increase the readership of this paper from 100 people to 200 people

The latter is overwhelmingly easier.

From the standpoint of efficiency, you should take promotion seriously. Another important point is that after acceptance there is no longer any externally imposed randomness like peer review, so this is something you can control much more easily. This is where you should properly absorb the damage randomness may have done earlier.

My paper "Training-free Graph Neural Networks and the Power of Labels as Features" has been accepted to #TMLR 🎉

I proposed training-free (and optionally trained) GNNs.

Paper📜：https://t.co/J6rOQrGejo
Code📁：https://t.co/gEzmwu5N48 pic.twitter.com/CsBJ8Sxak7
— Ryoma Sato (@joisino_en) August 20, 2024

A method I recommend is posting a poster-style image that summarizes the work clearly. It will not be wasted later, because you can also reuse it when making a poster or slides. One important trick is to attach it to your acceptance announcement tweet. At the moment of acceptance, people are more likely to like or retweet it partly to say congratulations, so it spreads more easily and more people learn about your work. If you separate the timing, then the acceptance announcement gets only congratulatory likes and people do not actually learn the content, while if you post the poster at some random later time, fewer people see it. Timing matters.

This is not a method I would recommend to everyone, but it is also possible to pay for advertising. I personally spend my own money to advertise my research. I am not so much recommending advertising itself as I am saying that you should think about promotion that seriously.

(A) Write one more paper that will be read by about the same number of people
(B) Increase the readership of this paper from 100 people to 200 people

Once you take research funding and your own labor cost into account, option (A) costs a fair amount of money. Trying for (B) through advertising might in the end be cheaper.

What matters is to keep repeating the whole process above. As I have said many times, the law of large numbers matters. Choose topics quickly, finish quickly, and accumulate projects. You definitely improve as you do more research projects.

Up to this point I have treated randomness almost like an enemy, but some amount of randomness is still necessary for excitement.

If horse racing were deterministic and you paid 10 dollars for a betting ticket, watched the horse run, and then received 9 dollars back regardless of the result, nobody would bet. In real horse racing the expected return may be the same 9 dollars, yet people buy betting tickets because there is randomness.

The same is true in research. Research whose result is visible from the beginning is not interesting. Eliminate every avoidable risk as thoroughly as possible, and then use the room that creates to take risks on essential challenges. If you do that, I think you can have excitement, stability, and research success all at once.

Once you gain knowledge, risk can be minimized. Use the law of large numbers, optimize the order of randomness, and make use of improvisation. Control randomness skillfully, and build a stable research life.

Author Profile

If you found this article useful or interesting, I would be delighted if you could share your thoughts on social media.

New posts are announced on @joisino_en (Twitter), so please be sure to follow!

Ryoma Sato

Currently an Assistant Professor at the National Institute of Informatics, Japan.

Research Interest: Machine Learning and Data Mining.

Ph.D. (Kyoto University).
