Thursday, August 07, 2014

The filter problem

Drugmonkey (@drugmonkeyblog) doesn't think there's a filter problem [There is no "filter problem" in science].

He writes,

It is your job as a scientist to read the literature, keep abreast of findings of interest and integrate this knowledge with your own work.

We have amazing tools for doing so that were not available in times past, everything gets fantastically better all the time.

If you are a PI you even have minions to help you! And colleagues! And manuscripts and grants to review which catch you up.

So I ask you, people who spout off about the "filter" problem.....

What IS the nature of this problem? How does it affect your working day?
I'm trying to keep up with a number of very broad and diverse fields. For example, as a textbook author, I need to keep abreast of just about everything that might be covered in an introductory biochemistry course. I'm also trying to keep informed about evolutionary biology, especially molecular evolution, because I teach a course on that topic and I blog about it. I don't want to miss exciting developments in pedagogy (teaching) and the philosophy of science. Finally, I like to be up-to-date on the latest advances in other disciplines.

Here's the problem. There's a lot of junk out there. It's a waste of time to scan all of the science journals that might possibly have something of interest to me and it's a waste of time to get any "tools" to do it for me. Most of the time I wouldn't even know what to ask for. For example, I don't want to see all the papers on photosynthesis but I need to see the one that's going to change my textbook. I don't want to see all the papers on mutation rates but I do want to see the ones that are worth blogging about.

There were times when I could sit down for a few hours every week and scan the tables of contents of the leading journals in my field. Those days are long gone and my "field" has expanded enormously. I need to filter but I'm pretty sure I'm missing some important papers. In fact, I know this because just about every month I hear from others about things that I missed months, or even years, ago.

I have a filter problem. I'm filtering out some important things and reading far too much junk. My filter problem can't be solved. If it weren't for blogs, I'd be in bigger trouble.

We're in the middle of a discussion about the function wars. It's obvious to me that members of the ENCODE Consortium also have a filter problem. They've filtered out all kinds of information about the organization of the human genome. They don't understand the evidence for junk DNA, for example, and they don't have a good grasp of evolution. On the other hand, they've probably read every recent paper on the methodology of RNA-Seq, ChIP, and data analysis algorithms.

I'm glad that drugmonkey doesn't have a filter problem. Or, should I say, I'm glad that he THINKS he doesn't have a filter problem. It must be comforting to believe that he's keeping abreast of everything relating to his interests. I've never felt like that.


  1. There is definitely a filtering problem. I feel it acutely myself.

    It is actually not at all easy to both write and read papers these days.

    There is an old Russian joke about the chukcha who applied for membership in the Union of Soviet Writers. They asked him which of the classics of Russian literature he had read, and he replied that he hadn't read any of them. Then they asked him on what grounds he should be admitted to the Union given this lack of essential knowledge and experience, to which he proudly said, "Chukcha writer, not a reader!"

    Given the direction of the pressures and incentives that scientists are working under, this is in fact becoming very relevant to our situation :(

  2. In my opinion, it's just a case of allocating however much effort is necessary. If, as you say, it's not something you can set up email/RSS keyword alerts for (via PubMed Saved Searches or Google Scholar Alerts) because your interests aren't so specific, then it is, as in the paper era, a case of just tracking TOCs.

    What @drugmonkey was saying is that there are abundant tools to do so (from alerts to the likes of Feedly RSS reader to Pubchase literature recommendations). I say this reading your blog via Feedly amongst a host of other journals' feeds (100+ easily).

    I think I'm in a similar boat to you in that I want a fairly broad overview of the biochemical/life sciences. One option is to just accept what Twitter or the blogosphere dredges up, but there's the danger of more noise on trendy or hyped topics (plus plain old network effects, where whoever has the most followers in effect shouts the loudest) than signal on high-quality or interesting new research.

    The line drawn here is naturally an entirely personal one, and if you really want to find these papers I'd recommend scanning through journal article titles. I don't find it a waste of time; it gives a rewarding sense of agency.

  3. I'm of the last generation of scientists who remember having to actually go to physical libraries, read physical books of abstracts to find papers of interest, look up the papers in bound journal volumes and then photocopy the papers. It took all day just to find a few papers. When scientific publishing moved to the Web and PDFs in the late 1990s, it changed everything. While the system isn't perfect (absurd journal paywalls that want you to pay $40 for a 20-year-old paper and so on), the fact is you can simply go to PubMed, type a few keywords and get all the papers about the subject (generally with full text) in a matter of minutes, while it would have taken days in the past. The past had the *real* filter problem -- it was nearly impossible to find anything -- that's why things like Mendel's results were forgotten and rediscovered.

    1. That is true; however, it is also true that the volume of the literature doubles every 15 years or so, while the volume of all the other things scientists have to deal with (administration, grant writing, paper writing) tends to go in only one direction, and that's up. And this is not sustainable.

      I have on multiple occasions heard senior scientists who, given a doubling time of 15 years, were there when the literature was an order of magnitude smaller than it is right now, talking about how they would just pick up the latest issue of each of the fewer than 10 journals one needed to read at the time to keep current on pretty much everything, and they would read it pretty much cover to cover. And it was actually possible to know most things there were to know about cellular and molecular biology at the time. All of this is unthinkable today.

      And this creates another very much unappreciated problem - there is basically nobody these days who has a truly broad grasp of contemporary biology (forget about other general disciplines), it is all divided and subdivided and subsubdivided, and there is really no other way it could be because there are only so many hours in a day and only so much information the human mind can meaningfully process, analyze and integrate. This is a big problem IMHO.

  4. I wonder if Drugmonkey uses any sort of spam detection software to filter his/her incoming EMAIL ....

  5. If you have a very narrow set of interests and they fall in fields with little going on, you don't have a filter problem.

    If that doesn't describe your field and you use tools to filter, you have a filter problem of another sort. You literally don't know what you're missing. This has been a big problem for a lot of people for a long time (it's hampered theorizing about human evolution, for instance), and it's only getting bigger.