
So to Speak podcast transcript: ‘The Half-Life of Facts’ with Samuel Arbesman

Note: This is an unedited rush transcript. Please check any quotations against the audio recording.

Nico Perrino: All right. Well, Sam, thanks for coming on the show.

Samuel Arbesman: It's my pleasure. Great to be on.

Nico Perrino: I was reading your biography and it says you are a complexity scientist. What does that mean?

Samuel Arbesman: So, complexity science is basically the study of complex systems, which doesn't actually tell you that much unless we can define what complex systems are. And complex systems are basically large systems that have a huge number of interacting parts – and this can be in biology, it can be parts of a cell, it can be an entire organism, it can be an entire ecosystem, it can be within technologies, lots of technological systems interacting like the internet, it can be within social systems like societies or cities, things like that.

But the idea is that no matter what the complex system, it turns out there are underlying similarities to a structure of the network of interactions, the underlying behavior, potentially even sometimes the emergent behavior that arises. That's actually one of the hallmarks of complex systems is this idea that due to the interactions of these large systems, of the parts of large systems, you might get behavior that you might not necessarily be able to predict simply by looking at the individual interactions.

So, for example, the behavior of a beehive might be hard to understand just by looking at an individual bee. Or the flocking behavior of birds might be hard to understand just by simply looking at the behavior of an individual bird. And so, there's kind of this emergent behavior that arises, and it can be studied in a computational and quantitative way. And that's sort of, what complexity science is very broadly.

Nico Perrino: And you argue in your book, The Half-Life of Facts, that knowledge creation and deterioration can kind of be analyzed in this way, as well. Right?

Samuel Arbesman: Yeah, yeah. So, the idea is that – that knowledge creation and kind of the way we think about facts and information, if you look at an individual fact or bit of information, you don't necessarily gain that much, but when you look – but when you take a step back and look at the overall production and generation of information and knowledge and how errors get rooted out and things like that, suddenly there are actually regularities.

And so, the book sort of comes from this analogy of the half-life of radioactive decay. When you have a radioactive material – if you have a single atom of uranium, for example, it might decay in the next fraction of a second, it might decay in millions and millions of years. You really can't predict it. But if you get a whole bunch of uranium atoms together, if you get an entire chunk of uranium for example, suddenly the radioactive decay goes from being unpredictable to actually quite regular.

You can actually graph out this half-life of decay. And the same kind of thing, at least on some level, is true with how we think about knowledge. When you look at knowledge as a whole, suddenly there are actually regularities to how knowledge grows, how it spreads from person to person, how errors get rooted out, how it gets overturned. Things like that.
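
To make the half-life analogy above concrete, here is a minimal sketch of the exponential-decay curve being described; the 45-year half-life figure and the fraction_remaining helper are purely illustrative assumptions, not numbers from the conversation or the book.

```python
def fraction_remaining(years_elapsed: float, half_life_years: float) -> float:
    """Fraction of an initial body of facts still considered true after
    `years_elapsed`, under simple exponential (half-life) decay."""
    return 0.5 ** (years_elapsed / half_life_years)

# Purely illustrative: a hypothetical field whose facts have a 45-year half-life.
HALF_LIFE = 45.0
for years in (10, 45, 90):
    share = fraction_remaining(years, HALF_LIFE)
    print(f"After {years} years, about {share:.0%} of the original facts remain.")
```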

Nico Perrino: And who started – I believe you call it or there's a name for this sort of study. It's like scientometrics, right?

Samuel Arbesman: Yeah, yeah. So scientometrics is basically the science of science. Kind of like quantifying how science changes over time. And there's also what we call metaknowledge, or – people say "science of science" as well, but I mean people have been thinking about this for a while. I would say probably at least one of the founders is this historian of science and physicist Derek de Solla Price, who actually compiled a huge number of quantitative ways of understanding scientific change.

He looked at the growth in scientific journals, the growth and change in publication, and the story goes that one of the ways he initially thought about this was he had all the actual print issues of a scientific journal in his home, I guess in piles against one of the walls in his apartment, and he noticed that the heights of these piles actually conformed to a nice mathematical curve, and that began his thinking about this sort of thing.

Nico Perrino: So, that's the growth of knowledge, and the growth of knowledge follows a predictable curve, you argue, but it's different for each field of knowledge. For example, the curve in studying opera might look different than the curve in, say, studying biology or a disease. Right?

Samuel Arbesman: Yeah, and I think that's kind of the exciting thing, is that different fields of knowledge, because of the differences in how those domains work and the fundamental properties of those fields, are going to actually have some differences, which means you can begin to think about what you should expect when you deal with different fields. Like, for example, in medicine – there's a lot of change, especially in terms of what the current standards of practice are versus things that are obsolete or no longer true.

And so, physicians – I guess medical students, when they're being trained to be physicians, are told, especially in medical school, that a good fraction of what you learn is going to be overturned within a few years of graduation, and so you really need to be thinking very consciously about this concept of the half-life of facts and the fact that knowledge is going to be changing, and you really need to very explicitly try to update your information because lives are on the line. So, it's really important.

Nico Perrino: And you argue that almost every fact that we've ever known has been subject to revision or reversal. Correct? For the most part.

Samuel Arbesman: I wouldn't quite say it that strongly, but I would definitely say that there are many instances where things that we thought were true have been overturned. So, for example, my grandfather, when he was in dental school in the '30s, I believe, actually learned the wrong number of human chromosomes. He learned that there were 48 human chromosomes instead of 46. Because it turns out, when there weren't as good imaging techniques, someone had measured and counted, and I guess in this case, miscounted.

And this bit of information propagated and actually made it into the textbooks and was propagated for, I guess, several decades until in the mid-1950s there was a better imaging technique developed and people decided to recount and realized that there were actually two fewer than people thought.

At the same time though, I think when – when you think – when you talk about how knowledge gets overturned, it's not that everything we thought was true is now wrong, and so we're kind of just flipping between states of certainty and suddenly we're ignorant again and we're worried about it; there's this constant sort of asymptotic approach to the truth.

So, there's this great quote from Isaac Asimov where he's corresponding with someone about whether or not the earth is flat, and he says, "When people thought the earth was flat, they were wrong. When people thought the earth was spherical, they were wrong." Because it turns out the earth is actually an oblate spheroid. And Asimov continued saying, "But if you think that thinking the earth is spherical is just as wrong as thinking the earth is flat, then your view is wronger than both of them put together."

And so, there's this idea that we are getting closer and closer to the truth, and so, things might be overturned, but it's in the service of actually getting closer and closer to some sort of better understanding of the world around us. And part of that you see when Newtonian mechanics was overturned by Einsteinian physics: it wasn't that suddenly Newtonian mechanics was thrown out. It's still actually quite useful for a lot of things like building bridges and building buildings, because these buildings are not being built at large fractions of the speed of light, so those kinds of Einsteinian effects are not necessarily that relevant.

But we are kind of getting closer and closer to a better understanding. So, I would say that's probably the caveat to better understand how we think about overturning knowledge.

Nico Perrino: A greater conception of the truth, in other words.

Samuel Arbesman: Right.

Nico Perrino: So, why is this important? I have – have some theories as to why this is important from the perspective of a free speech activist, but why did you want to write this book? Why did you think writing about the half-life of facts, or how we gain knowledge and how it's subject to revision or reversal, was important?

Samuel Arbesman: So, one of the reasons it's important is because at one level every one of us intuitively knows, to a certain degree, that things we might have learned when we were young or things that were in our textbooks when we were students are no longer true. Those things have actually been overturned.

But at the same time though, it's not just this pattern of complete flux and constant change where, "Oh, because I read that food X is unhealthy and now I read that it actually is healthy, therefore I can't know anything." There are actually rules and regularities beneath all this flux to how knowledge actually changes. And so, I wanted to really try to understand this. And the other thing I was also very interested in is getting people to realize that when people think about science, they think of it like this body of knowledge or body of facts. And science really isn't that. It's really a means of querying the world around us in order to understand it better.

And I think if we understand science in terms of that, then we're going to be a lot more positively predisposed to thinking about how knowledge is changing over time. And I also wrote this because I wanted people to better understand the scientific process, how people think about science. When we think about science, there's often the core of science – the body of knowledge that makes sense.

The textbook knowledge maybe is not changing quite as much and really is reasonably well known. Then there's the frontier, where there's a lot of flux. There are lots of things changing. These are the things we read about in the newspapers and the science section, where, "Oh, this thing we thought was true is now wrong, or this bit of nutrition information has now been overturned," and that's where we know the least, but in truth that's where the most exciting things are happening.

The reason scientists work at the frontier is because they really want to be actually overturning and better understanding the world around them. And so, we need to really make an effort to better understand how knowledge grows and changes, in order to actually kind of embrace this idea of changing knowledge and science as a means of inquiry as opposed to just a body of knowledge.

There's another story – a professor of mine in graduate school told me this right after I had graduated. One day – I think it was on a Tuesday – he came in and taught a course on some specific topic. And the next day he actually read a paper that invalidated the lecture from the day before. So, then he came in Thursday and said to his students, "Remember what I told you on Tuesday? It's all wrong. And if that bothers you, you need to get out of science."

And so, I think there's almost this sense of delight and wonder that yeah, these things are changing, but when that's changing, then science is being done right. We're actually learning more and more about the world. So, really, I wrote this to get people to better understand the scientific process, engage with science as sort of this means of inquiry, and really delight in the fact that we're constantly learning more and more about the world.

Nico Perrino: Yeah. One of the big questions I have after reading your book, is more or less how we better embrace this change and knowledge and whether there's certain pos – you know, possible public policy implications from your thesis about knowledge. At least – as we've talked about. You posit that knowledge creation and deterioration can more or less be predicted, but does that theory that knowledge creation and deterioration can be predicted hold regardless of environment?

Like, for example, is it the same in China as it might be in the United States? I was told, for example, that a lot of the basic code that US web developers use as building blocks for bigger projects can be retrieved via internet searches, which makes their jobs a lot easier, but in China that code is often hidden on websites that are censored, so the coders have to develop that code from scratch, making the knowledge production process, one might say, a little bit more difficult and perhaps slower.

So, I'm wondering what you think of the public policy implications of this and whether the environment, if it's held constant, has any sort of effect on it?

Samuel Arbesman: Yeah, so I guess I'm not sure I've thought about specific policy implications as much, but I would say – building on what I was saying before about the hallmarks of science – it's not this body of knowledge. It's kind of a set of ways of querying the world, and I think one of the ways is that science is cumulative. Like, you build on what came before you. You cite papers, you try to replicate studies, you look at data that people have collected, maybe analyze it differently.

And I think when innovation is occurring behind closed doors, or the output or the product of science is not as public, then people are not going to participate as much in this process. Science, when done right, is this kind of open means of querying the world where you test things, you try things out – it's cumulative.

You build on what has come before you, and so, in order to make sure that science is as cumulative as possible, you want to provide the right kind of incentives for people to get credit for their work, establish priority, be part of a community, share information – all in service of making sure that our body of knowledge is constantly improving and being shared and being as useful as possible.

So, I would say when there are pol – like policies that allow this sort of open exchange of information and testing, like those are the kind of things we want. And so, yeah, I would say those are the kind of policies that we want in place.

Nico Perrino: Yeah, you sort of want an environment that encourages open inquiry, and that's my main interest in this topic, because we often argue in free speech or academic freedom or open inquiry circles that one of the reasons you want that is not just because it helps the democratic process, for example, or that it is a boon to individual liberty and autonomy, but also that it helps produce new knowledge, and as a result, gives us all the cool tech that we like, it helps us live longer. It makes our lives all around more fulfilling when we learn more about our environment.

But, in reading your book it seemed as though there is almost sort of a determinist outlook at how knowledge marches forward, and I was just curious whether it marches forward regardless of our environment. And you know, whether science and knowledge doesn't have all that much to say about open inquiry and people will find their way to new knowledge regardless of their circumstances.

Samuel Arbesman: I would say – I don't think my argument is kind of like that deterministic. I think it's predicated on certain forces and the properties of the – like scientific inquiry actually being present. So, for example, if you can't publish your work or cite things in certain ways, then not only would science, from my perspective, not necessarily proceed as quickly, but in many ways you cannot even measure some of these kinds of things because a lot of the metrics that I'm talking about are based on understanding citation, and understanding how people exchange information.

So, I do think it's not as – like these things are not like the gravitational force where it's just like you set things in motion and then suddenly everything is going to be happening. These are not laws of nature. These are dependent on a good scientific climate and related to this you can see that – in going back to public policy things you were talking about, there are – the United States government has put in place certain things about making sure that people are sharing information within a certain amount of time after publication of the research.

I don't remember the exact details and I'm not – it always kind of depends on how people – the details clearly matter to make sure that people can not only just see the raw data or see kind of some sort of scientific models that are being used, but really being able to reproduce things and actually test things because I think part of making sure that science marches forward is really having this sort of self-correcting feature within science and part of that self-correcting feature is weeding out things that are not actually correct, that are flawed or don’t work in the way people expect or have a huge number of caveats in terms of their applicability.

So, I think really making sure that you have this openness and shareability is very important. And really, it's that, I mean people – and there are many people who are thinking about this very broadly. So, for example, and like one of the classic things that people always talk about and we have – we're in the 21st century. We have these technological tools, and yet, at least when it comes to the dissemination of our research, in many ways we are stuck in the mid-17th century.

So, like 1665 is the publication of – that is like – the first two scientific journals I think were 1665, and even though in many ways science has progressed quite a ways from the 1660s, we still use scientific journals and scientific articles as the means of disseminating our research, even though there's many different ways to rethink this kind of thing. And there are people rethinking how and what scientific journals should look like, how we should be disseminating this.

Also recognizing that now a huge amount of scientific research is not just traditional bench research. It's also incredibly computational, and so, really being able to share the computational constructs that you have for however you collected the data, examined it, ran some simulations, whatever it is – being able to put it in some sort of format or package that people can then easily run on their own machines, I think is actually really, really important.

And so, people are trying to think more broadly about how to have reproducibility of computational research, as well, as increasingly all of science becomes computational. So, I think there are a lot of people thinking about this kind of thing, but yeah, I think we still have a ways to go.

Nico Perrino: Yeah, I'd just love to see, for example, what the knowledge curves look like in all these various domains. For example, in North Korea. If we could study North Korea, is the curve as pronounced as it is in, for example, a lot of the open global western and eastern countries? I mean, knowledge in many ways is a global enterprise at this point with our increased communications technologies, and North Korea is really the only country, the hermit country, where you might be able to test an alternative, an alternative model, but…

Samuel Arbesman: Yeah, and to be honest, I'm not even sure that looking at any curves of, like, the growth of scientific research within North Korea would even be meaningful. And maybe there are papers there. To be honest, I really have no idea, but I don't know if those papers are good, if they're just citing each other. In some ways it's just kind of this weird little environment of this strange North Korean science. I have no idea what's going on there, and I don't really even know if it's science as the rest of the world might understand it.

Yeah, so I really think – and part of the exciting thing about science is that it really is, for the most part, transnational, international. And I was not doing research during the Cold War, but my sense is, at least from what I've read, that one of the few ways in which the Soviet Union and the United States worked together was actually through the scientific communities of those two countries – they actually were trying to share information as best they could.

And, of course, there were many times they were not, but I think science, when done right, really has this power to make information transfer as open as possible, in the service of learning as much about the world as we can.

Nico Perrino: Can we predict when challenges to new knowledge might happen? And when I say challenges, I mean from outside the scientific community. Namely, from outside actors. Sometimes it might be government, other times it might be interest groups. I'm thinking here about some of the climate change research that's come out in recent years. Maybe the stem cell battles of the '90s and 2000s, the Copernican Revolution.

It seems as though the challenges come when they implicate some sort of culture war battle. And you also talk about how new knowledge is harder to discover after you've taken care of the easier-to-discover problems. I don't know if I'm phrasing that correctly, but as things become more complex, I'm thinking that they also might become more abstract and confusing to a lay audience, and then also less likely to be challenged, either through censorship or through, I don't know, regulation.

Samuel Arbesman: So, maybe I'm not sure I understand the question. So, the idea is, when do people kind of re-think certain scientific dogmas? Because, certainly, having a healthy amount of skepticism is, I think, reasonable and good. I think, though, being a contrarian for contrarianism's sake, or being sort of a knee-jerk skeptic, is not necessarily productive.

So, are you asking more about whether there are sometimes cultural movements to begin questioning certain areas of science, or about when, within science, there are certain scientists saying, "Let's actually re-examine certain scientific things that we think are true," that maybe are not, because we have certain weird exceptions or bits of data that don't make sense?

Nico Perrino: Well, I guess I'm – this is kind of piggybacking on my earlier question about the public policy implications; just whether it's even possible to predict when a piece of knowledge will be challenged; not for its veracity, but more or less for some of its downstream implications.

I'm thinking about stem cell research here; I'm thinking about the argument that the sun is the center of our universe; even maybe you can apply this sort of line of thinking to the CRISPR dialogs that we're having right now, and how concerns about what CRISPR might result in might impede what would otherwise be just the general march in the direction of implementing some of those technologies.

Samuel Arbesman: Okay, and I guess – I'm gonna speculate here, but perhaps one of them is simply when there's a greater distance between sort of the scientific community and the general population, and so there's less of an understanding, or maybe there's a breakdown in the ability to explain certain scientific ideas to a general audience. But I'm not entirely certain that those are correlated with increasing complexity of a specific subject, because I do believe that when engagement with the public is done right, even pretty complex and esoteric ideas are able to be explained to a general audience.

And so, perhaps maybe it's a less of a willingness to try to explain these kinds of things. I'm – or maybe just a simple inability to do that. I don't know, again this is very speculative, but I wonder if it's kind of some sort of like when there's an increasing gap, then into that gap comes a certain amount of skepticism or concern even though it really doesn't make sense within the scientific community it's because there's this gap in language or in discussion. But I'm not sure.

Nico Perrino: Yeah, how important then is physical space, and perhaps something as basic as the number of people in a population in knowledge generation. I know you can say, you can predict more or less how innovative a community is by its population, but there has to be more than that, if I'm not mistaken.

For example, you don't get anywhere near the innovation of, say, Silicon Valley in a place like São Paulo, Brazil, despite its larger population of 12 million people. And I have always been curious why Paris, for example, in the 19th century, and Florence in the 15th and 16th centuries, produced so much more knowledge than some of the larger population centers elsewhere in the world. So, what do population and physical space have to say here?

Samuel Arbesman: Yeah, and so I think population at a very rough level – population is a very good first-order approximation, or even like a zeroth-order approximation; it's a rough proxy for people being able to interact with each other and having this sort of combinatorial interaction of ideas. And one of the things I discuss in my book – and Geoffrey West, the physicist, has actually explored this in far greater detail, he and his collaborators, in some of the work that I cite – is this idea that if you look at the population of a city, you can actually look at how, as the size of the city scales, the productive output also scales.

In this case it scales superlinearly, so the idea would be that if you double the population size of the city, you actually more than double the productive output, like the number of patents, or whatever it is. And so, there's this nice little curve, but of course, when you plot a whole bunch of cities on the curve, it's noisy. They don't fit exactly. There's gonna be a certain amount of noise.

And then within that, there are certain cities, like San Jose or kind of the San Francisco metro, that over-perform on that curve, and there are ones that under-perform. So, I think yeah, you're right, there's kind of this overall very rough sense that population matters, but then you can say, "Okay, then what are the details? Like how do we better understand why certain cities – certain regions – are more effective than others at generating ideas or producing knowledge?"
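
As a rough illustration of the superlinear scaling just described, here is a minimal sketch; the exponent of 1.15 and the scaled_output helper are assumptions chosen for illustration, not figures quoted in the conversation.

```python
def scaled_output(population: float, y0: float = 1.0, beta: float = 1.15) -> float:
    """Superlinear urban scaling: output grows as y0 * population ** beta, with beta > 1."""
    return y0 * population ** beta

# Doubling a city's population more than doubles its modeled output when beta > 1.
small_city, big_city = 1_000_000, 2_000_000
ratio = scaled_output(big_city) / scaled_output(small_city)
print(f"Doubling the population multiplies output by about {ratio:.2f}x")  # ~2.22x when beta = 1.15
```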

And then there are a number of studies that people have looked at that are more fine-grained, and I think that actually also shows that on a finer-grained level, distance and location also matter. There is a paper that came out, I don't know, 10 or 15 years ago or so, that looked at collaboration down to the level of within a building.

I think it was looking at biomedical research, and it was looking at collaboration patterns at a university, and then they measured the distance between these collaborators and, if I recall correctly, looked at the number of citations that their research received as a rough proxy for how impactful that research is. The idea is that if more scientists are citing your research, then it is more important, in a very general way.

And the idea was that they found that distance really mattered. The closer the collaborators were, the more highly cited their research was. And so, there's this idea that if you are closer – it's kinda like the water cooler effect that people always talk about. If you have the ability to really interact, then you're just exchanging ideas, and there's kind of something in the air where people can kinda piggyback on various things, and there is this sorta combinatorial effect.

In truth, I don't really know if people fully understand, in some sort of generic mathematical way, why Paris in the 19th century, or the golden age of ancient Greece, or Edinburgh in the 18th century, or whenever it was, was so impactful, kind of out of line with its population size or whatever, and I think then you really dig into the history and look at the details.

And I think ideally from those details you can maybe then try to create a more general theory of understanding of why – how population matters, how distance matters, how certain types of ideas and certain types of maybe, size of university or the ratio of people – like people at a university versus the overall [inaudible] [00:28:16] population within that metropolitan area. How those things matter. Again, I don't know which of those are correct, but I think you need a combination of sort of these generic theories as well as looking at the historical details.

Nico Perrino: Yeah, it's just another sort of interesting point for me when you're looking at possible public policy implications for people who want to supercharge the knowledge generation process, or even, as a manager at an organization myself, how I can help my department become more creative. Maybe working from home or having multiple employees work from home isn't the greatest idea if proximity is a generator of creativity and knowledge. I don't know. It's just – sort of an interesting thing.

Samuel Arbesman: Yeah, and at the same time though, I would just be mindful of the fact that some of these certain effect sizes might be small, so it's the kind of thing where if you're able to do things pretty well, and there's a huge amount of cost to reorganizing your entire organization – there are tradeoffs to just kind of be aware of. And there is also a lot of path dependence.

So, there's a huge number of reasons why Silicon Valley is Silicon Valley, and many of those things are not repeatable – in terms of why a certain group of people moved from the East Coast to the West Coast, started their company, and then people fled that company and went to other things. And then there were defense contracts, and all these kinds of things really led to why Silicon Valley is Silicon Valley.

And some of those kinds of things are very difficult to really repeat. Then again, there are other things that probably actually do have greater policy implications. So, people have actually studied whether or not the non-enforcement of non-competes is actually a really important factor for innovation. The idea is that with a non-compete, when you move from one company to another, you can't compete with your previous employer for a certain number of years or whatever it is.

California does not allow enforcement of these non-competes, but other states do. And so, the question was maybe this non-enforcement of non-competes actually is important for innovation, and I believe there were some economists that used this great natural experiment – I forget the exact details, but I think it might have been in the 1980s that Michigan went from not enforcing non-competes to enforcing non-competes – I forget which direction it was, but I think it was going from non-enforcement to enforcement.

And they noticed that right around the time the enforcement happened, there was actually this flight of innovative individuals from the state, and so there are certain policy things that can be done that are fairly easy to do, but I think other things, like that kind of path dependence of history, are very hard to replicate.

Nico Perrino: Well, it's interesting when you talk about non-competes. You can kind of see the same sort of innovation and creativity spawned around a more liberal fair-use doctrine. For example, if you don't have to worry so much about someone coming after you for copyright infringement when you do some artistic adjustments to, say, a zombie movie or to a song, you get more people playing around and having fun with various versions of a song or a movie, and some of that has generated new ideas.

I forget what zombie movie it was, but the zombie genre more or less arose out of one of those early zombie movies failing to put a copyright notice on the movie – and I'm gonna go back and look at what movie it was so that I can put it here in the show notes – but they failed to put a copyright notice on it, and as a result a lot of people used, more or less, its intellectual property in other contexts to spawn the zombie movie genre.

So, it's interesting –

Samuel Arbesman: That's fascinating.

Nico Perrino: When you allow that sort of creativity to spawn around and you don't build large walls around any individual enterprise, what can happen as a result. I wanted to ask you about citations and journal articles because one of the big data points that you use when you're analyzing – not just you but anyone involved in sort of examining this space, one of the data points you use are these journal articles and citations for them.

But you say in your book that something like only 20 percent of journal articles cited were actually read by the authors, which leads me to believe that the citation of journal articles really doesn't have much to say about the veracity of what's actually found in those articles – just perhaps that some influential previous scholar had read it, cited it, and then someone else thought it was probably a good idea to cite it in their work, as well.

Samuel Arbesman: Yeah, so I don't remember the exact details of how people determined that 20 percent figure – it might have actually been through errors in citation, where there were typos and things like that, I forget exactly the detail – but certainly citations are a proxy for a whole bunch of different things. You might cite a paper if it's influential, if it's related to what you're doing, or you might cite a paper only to try to actually refute it or disagree with it.

So, it's certainly a very rough proxy, and I would say a much better gold standard in terms of thinking about a half-life of knowledge over time would be something like actually surveying experts in a field and saying, "Okay, which of these papers are still true, which ones have been overturned or rendered obsolete," and there are people who have actually done that kind of thing. So, I think those are much better means of understanding how knowledge is thought about.

At the same time though, there are certain things that are incredibly foundational for science. So, for example, Isaac Newton's Principia. I imagine that is not being cited with nearly the frequency that many people might expect, given its foundational status within physics. At the same time, that's not surprising; there have been a lot of things since then, and I'm pretty sure it was written in Latin, and people aren't reading that.

And so, I think there are different ways of understanding scientific change and measuring certain scientific things, and I think people are getting better at finding richer sets of data for how we think about this kind of thing. But certainly, you can also see that as citations matter more for things like tenure decisions, that skews the incentives for how people think about getting citations.

And you also have issues with how authorship works. There are many situations where one of the authors on a paper really hasn't contributed that much, but they're still on the paper for other cultural or sociological reasons, or political reasons.

In the same way, citations are done for lots of different reasons, and I think we just need to recognize that when we look at authorship: authorship order varies from domain to domain in terms of how people do it. And with citation, certain types of journals allow more citations – there are space constraints, and if there are fewer space constraints, that might actually change how you cite things, so I think…

Nico Perrino: Yeah, in your book you talk about how the more successful a scholar is, the more likely their name is listed later in the authorship order.

Samuel Arbesman: Right, and yeah, so I would say it's more that if you are the head of a lab, at least in kind of like the biological sciences, then you're sort of the last author. So, the first author and the last author are often the most important, and then it goes in from there to the very middle one, which is the least important. But I think in economics there are many cases where it's simply just alphabetical order.

And so, if you were coming from a field that was not familiar with that and you looked at an economics paper, you might have a very different sense of who's important and who's not. And so, I think recognizing the differences, as well as the limitations, in how people think about these kinds of things – citation, authorship order, all the other kinds of factors – is really important.

And people are trying to address some of these kinds of things, so I think there are now a number of journals where, when you publish a paper, you also need to include what sort of tasks and what parts of the paper each author was responsible for, to really make sure that the readers have a sense of who contributed what, who was involved in what sort of thing, and what went into the actual output of the paper. So, I think there are some good changes that are happening, which is exciting to see.

Nico Perrino: Going back to what I was telling you before about the zombie movies, I just pulled it up on my browser here, and the movie is Night of the Living Dead, which was originally called Night of the Flesh Eaters, but then when they were going to change it to the more popular title that we know of today, Night of the Living Dead, they changed the title card and forgot to put the copyright notice on that card.

Samuel Arbesman: Interesting.

Nico Perrino: Yeah, and I'm reading an article about it today – they say that though it would not be a large issue today, in 1968 that meant the movie was not protected by copyright and instead was placed immediately into the public domain, and as a result people were creating unofficial sequels to it that spawned more or less the zombie genre that we know today. Previous to that, zombies were more thought of as voodoo zombies, people who had spells cast on them, rather than sort of disease-ridden zombies.

Just an interesting sort of quirk of how a –

Samuel Arbesman: That's fascinating. And this also reminds me that when you look at the space of things that are in the public domain, in many cases it's not always easy to actually know if something is in the public domain or not, and that's a problem. Because if you want to build on knowledge and information, or just any creative outputs and products, you can't – because if there's a concern that it might be in the public domain and it might not be, then I imagine in many cases people just err on the side of, "I won't deal with that, and we'll just try to choose something else, or just not even make something."

Nico Perrino: Well, isn't that what coders do?

Samuel Arbesman: Yeah, and you would love to have – you'd love to make it easier, so people can actually build on that. You still want people to get credit and be able to kind of be like – remunerated for their work, but at the same time though you still – the whole point of having copyright is to encourage people to do things but at the same time making sure that it's time limited to also then allow it to enter the public domain and be sort of this body of cultural knowledge that people can build on.

Nico Perrino: Didn't early coders and web developers sort of implicitly, or perhaps explicitly, understand this when they decided that one of their ethos was gonna be this open source framework, where someone creates code and they would just kinda put it on the internet for other people to build off of?

Samuel Arbesman: I mean, certainly some groups of programmers were into this kind of thing, and certainly one of the major innovations of Microsoft was actually recognizing that software has value and therefore they should be paid for it. But at the same time, there was this other group – the open source movement – and they created this sort of legal structure to ensure that these things are not copyrighted, or have sort of limited rights, to allow people to share things, and sometimes actually force other people who use them to make sure that the things based on them are, in turn, also shareable.

And so, there are lots of different paths. The way I view it is, you really want to make sure that there are as many paths as possible because you want people to have options. And certainly, open source has been enormously beneficial, and, in many ways, science is kind of like the apotheosis of this kind of open source idea, because some of it occurs in the university setting and people are not getting paid for their research papers.

In many cases, especially for papers that are publicly available, sometimes the university or researcher actually has to pay to get the paper published even after it's been peer reviewed. But the idea is that because the university academic incentives for credit and priority are separate from ownership and copyright, it ensures that scientific information and scientific knowledge is going to be, hopefully, as widely disseminated as possible.

Now, of course, there's the whole scientific publishing industry, which has made a lot of this a lot more difficult because there are many, many things that are behind paywalls, and there are many people who have great concerns about this – because these things are behind paywalls even though a lot of it is government funded, and so the idea is, if it's government funded and I paid for it with my own taxpayer money, I should have access to research about some genetic disorder that runs in my family or whatever it is. This should be a public good.

And so, there are many people who are trying to make sure that this information is publicly available, kind of more open access.

Nico Perrino: In your book you talk about social networks a bit, and I'm interested in exploring social networks and how they help spread knowledge. Obviously, the more open channels of communication the more easily or the easier it is for knowledge to be disseminated, but you also talk about how the more open the channels of communication the more – the easier it is for falsity to be spread, as well.

And your book, in some ways, was prescient here when we talk about fake news today, for example. As Benjamin Franklin put it, "A lie gets halfway around the world before truth has a chance to put its pants on," and I think about this in the free speech context as well – that there's that first mover advantage. And the Voltaire quote, or the alleged Voltaire quote – "I might disapprove of what you say, but I'll defend to the death your right to say it" – isn't actually a Voltaire quote; it's actually a quote from one of his biographers, Evelyn Beatrice Hall, but it was put out there, it was catchy, people liked it, and so now it's ascribed to Voltaire and people don't even think twice about it.

So, I wanted to hear a little bit about how your thinking on social networks has developed in this age when we're talking about fake news and how social media might amplify it.

Samuel Arbesman: Yeah, and certainly errors are hard to root out within a body of knowledge and can spread easily. Like, say, retracted papers – they can still be referenced years after they've been retracted, and we might need better mechanisms for error correcting within science. And I think part of that is also just the idea that even though we think of ourselves as this hugely interconnected world, where you can search for any bit of information at any moment and you can connect with any person you want, a lot of information is still really siloed.

And I think one of the barriers is this idea that there are almost jargon barriers, or simply just a huge amount of knowledge. And so, one of the things I think about with this – and this is not quite going to some of the social media stuff – is that when you have a huge number of non-overlapping scientific fields, you end up – even though you think, okay, information can be shared very equally, very widely – you still get the rediscovery of scientific ideas over and over in different fields.

Like sometimes, years or decades later, a different field might rediscover something that has been well known in some other field because they weren't talking to each other. And similar to this, there's also this idea that because fields are not able to talk to each other, there is just a lot of potential new knowledge that's lying on the table.

There's this idea that was developed in the mid-1980s by this information scientist Don Swanson, known as undiscovered public knowledge. This idea that he kind of – he spins this thought experiment saying, okay, imagine somewhere in the scientific literature there's a paper that says, A implies B. And then somewhere else in the scientific literature, maybe in the same subfield maybe somewhere else, there's another paper that says B implies C but because of the vast scientific literature no person has actually read all the papers that are out there and realized that maybe we should connect these two papers and maybe in fact, because of these two papers A implies C.

And so, there's this undiscovered knowledge that's out there, and the cool thing is that Swanson was not just content leaving this as a thought experiment. He then used the cutting-edge technology at the time, which I think was using search terms on the MEDLINE database, and he actually found, I believe, a relationship between consuming fish oil and helping treat some sort of circulatory disorder, and I think he ended up even publishing it in a medical journal even though he didn't really have medical expertise.
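
A minimal sketch of the Swanson-style "A implies B, B implies C" inference described above, run over a toy set of literature claims; the specific claim strings and the undiscovered_links helper are simplified illustrations, not Swanson's actual method or data.

```python
# Toy "literature": each tuple (A, B) stands for a paper somewhere claiming "A implies B".
literature = {
    ("dietary fish oil", "reduced blood viscosity"),
    ("reduced blood viscosity", "relief of a circulatory disorder"),
    ("magnesium deficiency", "cortical spreading depression"),
}

def undiscovered_links(claims):
    """Return (a, b, c) chains where A->B and B->C each appear in the literature,
    but no single paper states A->C directly."""
    inferred = set()
    for a, b1 in claims:
        for b2, c in claims:
            if b1 == b2 and a != c and (a, c) not in claims:
                inferred.add((a, b1, c))
    return inferred

for a, b, c in sorted(undiscovered_links(literature)):
    print(f"Candidate hypothesis: {a} -> {c} (via {b})")
```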

And he showed that there's just a huge amount of knowledge that's out there, and things that could be learned but are not, because our social networks are often more balkanized in many ways than we might think. And so, in the same way that you were talking about how error can persist for a long time, we're also just missing lots of information, where people are rediscovering or reinventing ideas in one domain after another because they're not talking to each other.

There's information that we have not learned and new knowledge we haven't learned because these – because various fields are not talking to each other. And so, I think one of the really important things, at least from a scientific perspective, is having people who can interact across different domains, like more interdisciplinary thinking which a lot of people talk about.

So, there's this concept of what people refer to as the T-shaped individual. The idea is that you have sort of the vertical part of a T, sort of like your depth of knowledge within a certain domain, and then you have the horizontal part of the T, your ability to interact across different domains.

And I think more and more we need lots of people like that, because otherwise, not only are we gonna have errors persisting in areas, and fake news, without people being able to say from a domain maybe one step over, "Guess what? That actually has been overturned and it's been wrong for a while," or, "We actually knew about this for quite some time," but we can also learn things a lot better and actually create new knowledge by just connecting different domains together.

Nico Perrino: Just as kind of an aside, what do you make of the "replication crisis" that's out there – the idea being that there are certain studies that are put out, they might capture a ton of media attention as a result of their findings, but they're never replicated, which is an important part of the scientific process to ensure that the original finding wasn't just some sort of fluke.

Samuel Arbesman: So, it's definitely a problem. It's certainly a concern, and the one thing I – so in the –

Nico Perrino: Do you think it's a crisis though?

Samuel Arbesman: Well, crisis is a little strong – maybe it's a little strong, but I do think – so, in my book when I wrote about this, one of the things I was concerned about is that people don't gain prestige in the same way just by replicating someone else's work as they do by doing something new. And I thought because of that there was going to continue being this sort of problem, and the one thing I've actually been gratified to see since the publication of my book – not related to me – is that to a certain degree I was wrong about this.

There are actually now projects trying to replicate lots and lots of research and, in many cases, showing that a lot of these things are not replicable. So, I think it's not so much a crisis as the fact that we need better mechanisms for incentivizing and rewarding people who actually try to replicate science and make sure it is this replicable, repeatable body of knowledge. Because otherwise, if people are only trying to do new things as opposed to actually trying to make sure that what has been published is correct, we're gonna end up with a lot of spurious results.

And so, the one thing I have seen over the past few years is that because of heightened awareness and people talking about this kind of problem, we are now creating new projects and new ways of trying to make things more replicable, whether it's making it easier to share the underlying computational model for when you do your analysis, or allowing people to be rewarded when they actually try to replicate things and find that things are not replicable.

So, a crisis – I guess crisis is maybe a little too alarmist, I would say. It's certainly a problem. The one nice thing – I would say the one sliver of hope – is that it seems as if, because of the heightened awareness of this, we are now having mechanisms to incentivize people to actually try to make sure that science is as replicable as possible. And so hopefully, in the future, we won't have as many of these problems.

Nico Perrino: I want to close by asking you how much of the knowledge in your book, or the studies that you cite, have reached their half-life or become irrelevant or have been overturned. Your book came out in 2012; we're going on six years since its publication. What else have you learned, and what have you found to be wrong in your original work?

Samuel Arbesman: Sure. Well, certainly the one thing I said, where I felt like with the replication crisis people would have no incentives for replicating things, turned out to be wrong. And I feel very glad that I was wrong, because that means science is doing the right thing. One other thing – I certainly don't know the percentage of things that have been overturned, but I can definitely tell a story.

So, actually it turns out – one of the stories in the hardcover edition of my book is about an error – a typographical error – related to the iron content of spinach and how it persisted for decades. And it turns out, between the time my book went to press and when it was actually published, I learned that this story was not actually fully true.

So, it turned out – what happened was, when I wrote this story I quoted this story from a British medical journal, I think the article was from like 1980 or so, about this kind of typo that led people to think that spinach had ten times more iron than it actually did, and because of that we have Popeye and all of these kinds of things, and it turns out this is probably not correct.

It was probably maybe some experimental contamination issues or confusion between iron oxide and iron, but this error, in the late 19th century or whenever it was, was actually corrected relatively quickly, like in the span of a couple of decades or so, I think by the early 20th century, as opposed to persisting all the way into the 1930s. But this story of how error persists – this erroneous story itself – was only debunked in, I think, 2010, and I actually was not made aware of it until mid-2012, after my book had gone to press.

And so, I was able to mention that and include the whole story in an afterword of my paperback edition. So, that was kind of exciting. Another thing – and this was not necessarily from the book, but I remember an essay that I wrote that related to some of the ideas of the book, before I published it – I used this well-worn analogy that people always talk about, of when things change slowly it's sort of like a frog being boiled in water.

And if you kind of like – a frog will jump out of boiling water but if you slowly raise the temperature, the frog will be used to it and then it will be boiled alive. And it turns out, this is not true and James Fallows, a writer at the Atlantic, he actually has devoted a lot of effort to root out this error, and it turns out – I think people have actually tried this and it's only true if the frog is brain dead or something like that, which reduces the effectiveness of the story quite a bit.

And so, I was able to actually include that within the book – saying that actually, "Guess what, the story is not actually accurate," but I'm sure, yeah, there are things as well, too.

Nico Perrino: All right. Well, I think we're gonna leave it there. We've been talking for close to an hour now. Sam, I really appreciate your coming on the show.

Samuel Arbesman: It's my pleasure. Thanks so much.
