Much research suggests that people rarely get a sudden flash of inspiration; instead, they gradually build up ideas over time. I was interested in one specific piece of this: how can we measure buildup, or the lack thereof? Of course, it is easy to eyeball ideas or to ask people after the fact, but I wanted a more objective way to measure buildup. Specifically, what part of an idea is “original” and what part comes from the previous idea?
To make this concrete, consider a toy example where a person is creating words from a given set of letters (say P, R, A, N, I, E), and they create two words, PAIN and RAIN. Surely the second word was motivated by the first. But is there an objective criterion that I can use to figure out how much buildup happened?
One approach might be to use the size of matching substrings as a measure. The number of characters left unmatched gives an indication of the distance between the two words. Dividing this by the total number of letters gives a normalized measure of distance.
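To see what this looks like in practice, here is a minimal sketch in Python. It rests on a few assumptions of mine: "total number of letters" is taken to mean the letters of both words combined, the matching substrings are found with the standard library's `difflib.SequenceMatcher`, and the function name `substring_distance` is made up for illustration. Other readings of the measure are certainly possible.

```python
from difflib import SequenceMatcher


def substring_distance(a: str, b: str) -> float:
    """Fraction of letters (across both words) not covered by a matching substring."""
    total = len(a) + len(b)  # assumption: normalize by the letters of both words
    if total == 0:
        return 0.0  # two empty words are treated as identical
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    # get_matching_blocks() ends with a zero-length sentinel, which adds nothing to the sum.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    # Each matched character appears once in each word, hence the factor of 2.
    unmatched = total - 2 * matched
    return unmatched / total


print(substring_distance("PAIN", "RAIN"))  # 0.25: only P and R are unmatched, out of 8 letters
```

With this reading, identical words get a distance of 0, words with nothing in common get 1, and HE and THEY come out at one third, since the T and the Y are the only unmatched letters out of six.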
One problem with this approach might be that it is too conservative or noisy: it fails to adequately catch semantically similar words (for instance, HE and THEY), and even phonetically similar words. So a more accurate (and less noisy) approach might be to use multiple such measures and then aggregate them.
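As a rough sketch of what aggregating could look like, the Python snippet below takes a weighted average of several distance functions. Everything here is an assumption for illustration: the weighted mean is just one possible way to aggregate, the names `aggregate_distance` and `letter_set_distance` are made up, and the letter-set measure is only a stand-in for a second signal. A real semantic or phonetic measure would plug in through the same two-argument interface.

```python
from difflib import SequenceMatcher


def substring_distance(a: str, b: str) -> float:
    """Share of letters (across both words) not covered by a matching substring."""
    total = len(a) + len(b)
    if total == 0:
        return 0.0
    blocks = SequenceMatcher(None, a, b, autojunk=False).get_matching_blocks()
    matched = sum(block.size for block in blocks)
    return (total - 2 * matched) / total


def letter_set_distance(a: str, b: str) -> float:
    """Jaccard distance between the sets of letters used (a stand-in second measure)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(set_a | set_b)


def aggregate_distance(a, b, measures, weights=None):
    """Weighted mean of several distance measures, each returning a value in [0, 1]."""
    if weights is None:
        weights = [1.0] * len(measures)  # assumption: equal weights by default
    if len(weights) != len(measures):
        raise ValueError("need exactly one weight per measure")
    weighted_sum = sum(w * measure(a, b) for measure, w in zip(measures, weights))
    return weighted_sum / sum(weights)


measures = [substring_distance, letter_set_distance]
print(aggregate_distance("PAIN", "RAIN", measures))
```

Keeping each measure normalized to the same 0 to 1 range is what makes a simple average meaningful here; how to weight the measures against each other is an open question that this sketch does not try to answer.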