Open Science, Open Access and Open Source

I have been thinking this over for quite a while, and have written this post several times over in my mind. As an undergraduate student I remember admiring scientists and imagining how amazing it must be to have a job where you got to discover new things, think of better solutions to problems facing our society and make the world a better place. As my studies continued I aspired to become one of those researchers, so I decided to take my studies further and applied to do a PhD.

As a PhD student I enjoyed learning more about materials, and was excited to be working with gold nanoparticles and researching how we might make real devices out of this novel material in the Nanomaterial Engineering Group. It was exciting, challenging and fascinating using techniques such as X-ray and neutron reflectometry, electron and atomic force microscopy and Langmuir-Blodgett troughs. As I learned more through my work I became frustrated with the quality of the software I used, and had always imagined that "real scientists" had better tools available to them. It became even more frustrating when I realized how bad some of the instrument control software was, and how so many of the file formats could only be used in one or two expensive and hard-to-use programs that only worked on one or two platforms.

Towards the end of my PhD I decided I would like to take some action. I had been trying to draw and render images of molecular structures, and wanted a way to do simple geometry optimizations for posters, papers and web pages. At first I tried to do some of this using an existing commercial package, but it only worked on Windows and we only had one license for the department. The training provided to me as a researcher in areas such as programming and analysis was disappointing, and all too often generic tools such as Word, PowerPoint and Excel were the most viable choice for preparing, analyzing and presenting our work. I began writing more software, but much of it was written from scratch with little guidance. As I searched for a better way I came across some open source libraries and tools.

I found a program run by Google called "Summer of Code" where they offered me the opportunity to "flip bits not burgers". I was extremely lucky to find an idea on KDE's idea page for a molecule editor in Kalzium. I was very excited, and had been using KDE for many years. This was a pivotal moment for me, where my life and career took a twist I never expected into the world of open science - and I have loved every minute of it.

It was through that work that I became involved in the Avogadro project, and later Open Babel, and met Geoff, who later that year offered me a position in his new research group. This was an exciting opportunity: not only did we share a passion for correlating experimental and computational techniques, but Geoff was also very active in open chemistry. After I moved out to Pittsburgh Geoff introduced me to the Blue Obelisk, and I now proudly count myself as one of their un-members. Last year we published an open access paper on the Blue Obelisk, five years on.

After a two-year postdoctoral position with Geoff, who was extremely supportive of my work in open chemistry, I met Bill Hoffman from Kitware. I knew that Kitware developed CMake, but beyond that was not really aware of what they did. It turned out that they were involved in much more than just CMake, with open source tools and frameworks such as VTK, ParaView, ITK, CDash and more. They had been working on open scientific software for over a decade, and they were hiring! They weren't just making applications either; they were tackling the whole problem, including development, testing and validation of open-source, cross-platform applications and frameworks.

One thing I never really appreciated until after accepting a position with Kitware in 2009 was just how poor access is to publicly funded research. I can no longer access scientific papers that I and others wrote, which were funded with taxpayer money from both the UK and the US! I think that is terrible, and later realized I had become part of the scholarly poor; Peter wrote a follow-up detailing the plight of those of us in industry. There is currently a raging debate on open access, and campaigns such as The Cost of Knowledge need our support. The products of publicly funded research should be available to all, whether they are in academia, industry, government or anywhere else.

There are too many black boxes in science today, too much published work that is not available to all or reproduced by others. Mathematics used to be the language of science, but more and more it is computer software that is needed to learn more, and too much of this code is closed, unpublished and poorly shared. Papers must include mathematical proofs, or refer to proofs already published, but it is common to see work published that used closed, proprietary package X to conduct a simulation. This is changing, and Scientific American recently published an article on how "Secret Computer Code Threatens Science". Science also published an article about "Shining Light into Black Boxes", detailing the growing problem of withheld source code preventing meaningful peer review and reproducibility of research.

Michael Nielsen published a book called "Reinventing Discovery" that talks about the value of networked science, and is well worth a read if you have not yet had a chance. The Panton Principles outline the need to make scientific data open, and the Science Code Manifesto calls for openly available code in science. The core goals of the Blue Obelisk are open data, open standards and open source. I think for science to progress we must embrace openness and sharing, and resist the urge to hoard data and build up small empires on proprietary code and data.

One thing I hope to see come from all of the controversy of the Research Works Act is a clarification that publicly funded research should be available to all, whether you think they will understand it or not. Scientists need to get better at communicating with the general public, and being more transparent about how research is done. I think open science will give us a chance to increase public engagement in science, the lack of which seems to be a growing problem even in an age where we can all access the internet and the wealth of knowledge available on it.

I think that we need to figure out sustainable ways to fund the development of open software platforms to enable the next generation of researchers to push back the frontiers of science. We need to remember that we are publishing to share the results of (often publicly funded) research, and so we should be using liberal licenses such as CC-BY and CC0 that allow reuse and further analysis. We also need liberally licensed software that allows those same things, with simple licenses such as BSD and Apache 2.0. These libraries should contain well-tested implementations of data structures, algorithms and best structures, along with training for researchers to help them take advantage of these resources. If there is a better way to do something, contributions and integration should be encouraged, as is the case in most open source communities.

Our Open Chemistry project recently got Phase II SBIR funding, and I am very excited to be leading that work at Kitware. It is part of a collaborative, open effort to improve the tools and frameworks available in the area, leveraging new software processes to enable wider community involvement.

You Can Call Me Doctor Now (Almost)

Yesterday was the day of my viva (or viva voce in full). It is one of the reasons I have been so quiet recently - preparing for it. I am pleased to announce that it went really well and I passed. There is some paperwork and stuff I need to take care of but I am now a doctor (well, almost ;-) ). I did go out for a few drinks for most of yesterday evening with friends to celebrate which was great.

Thesis Submitted

Yesterday I submitted my doctoral thesis! I am really tired but pleased to have it in. Lots more to get done this week and I will hopefully get time to write about some of the other work I have been doing soon. I have my viva voce to look forward to at the start of September.

ePiX Updates - Easier File Reading

In the past few weeks I have been doing lots of data analysis and graph plotting. I use some graphical tools such as Grace, Veusz and QtiPlot but when doing lots of plots for a big document I wanted something a little more scriptable.

ePiX seemed to fit this bill and when I contacted the author about a few problems I had getting it to work he was very responsive. He is a mathematician whereas I am a physicist and have to handle lots of data in general. So I found the data file reading quite fragile. It does however have some great advantages such as being able to write LaTeX equations right into the graphs and figures.

Absorption profile plotted by ePiX

The image above shows a plot of the absorption profile of a gold nanoparticle suspension. In the recent ePiX 1.0.24 release I added a tokeniser to the file reading functions of the data_file class in order to make data file reading more robust. With that in place I was able to read all my data files in using a simple loop and plot them as shown.

I have also just finished a data pruning function for the same class, but I am not sure if there is a better way to implement it. It uses brute force right now to iterate through the data and erase unwanted points. It does work well but I am not sure if there is a better way to accomplish its goal using some of the STL algorithms. It does need to delete the whole row though.

  void data_file::prune(double min, double max, unsigned int col)
  {
    // Erase rows where the data is outside of the specified range.
    // One iterator per column, all advanced in lock-step along the rows.
    std::vector<std::vector<double>::iterator> iter(m_data.size());
    for (unsigned int i = 0; i < m_data.size(); i++)
      iter.at(i) = m_data.at(i).begin();

    while (iter.at(0) != m_data.at(0).end())
    {
      if ( *iter.at(col-1) < min || *iter.at(col-1) > max )
      {
        // erase() invalidates the old iterator, so keep the one it returns
        for (unsigned int j = 0; j < m_data.size(); j++)
          iter.at(j) = m_data.at(j).erase(iter.at(j));
      }
      else
      {
        for (unsigned int j = 0; j < m_data.size(); j++)
          iter.at(j)++;
      }
    }
  }
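
One possible answer to the "better way" question, offered as a sketch and not as ePiX's actual code: decide once which rows to keep, then compact every column in a single pass. Erasing from the middle of a std::vector shifts all later elements each time, so the brute-force loop can become slow for large files, whereas this version touches each value once. The free function prune_columns and the columns type alias are hypothetical stand-ins for the data_file member and its m_data storage, and the sketch assumes every column has the same number of rows.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for data_file's storage: one vector of doubles per column.
using columns = std::vector<std::vector<double>>;

// Remove every row whose value in column `col` (1-based, as in prune)
// lies outside the range [min, max].
void prune_columns(columns& data, double min, double max, unsigned int col)
{
  if (col == 0 || col > data.size())
    return; // invalid column index: leave the data untouched

  const std::vector<double>& key = data[col - 1];

  // Pass 1: build a keep/drop mask from the key column.
  std::vector<char> keep(key.size());
  std::transform(key.begin(), key.end(), keep.begin(),
                 [min, max](double v) { return v >= min && v <= max; });

  // Pass 2: compact each column in place, preserving row order.
  for (std::vector<double>& column : data)
  {
    std::size_t out = 0;
    const std::size_t rows = std::min(column.size(), keep.size());
    for (std::size_t row = 0; row < rows; ++row)
      if (keep[row])
        column[out++] = column[row];
    column.resize(out);
  }
}
```

The mask is built before any column is modified, so the key column can be compacted along with the others and whole rows are still removed together.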

This pretty much covers the extra bits I needed for data analysis. A nice legend function would be good as drawing legends isn't as automated as I might like. I would welcome any feedback on the prune function as I think this would make a good addition to ePiX.

My Office Chair Is No More

It would seem I have been working so hard on my thesis that my chair has given up the ghost and broken. I was pondering over some graphs I was plotting for my thesis, leant back slightly and the plastic arm broke :-( I am going to have to go out tomorrow and find a replacement as I am getting back ache and don't need any excuses not to work on my thesis ;-)

They just don't make things to last any more... In related news Nick convinced me to join Facebook. Despite writing my random ramblings for the world on my blog, and putting up random pictures in my gallery I have never really bothered taking part in any social networking sites. I will see how it goes but don't intend to let it distract me from my primary goal right now - finishing my thesis!

Kalzium in KDE 4: 3D Molecule Viewer

Some of you may remember a post I made about Kalzium quite some time ago. In KDE 4 trunk it now has a 3D molecule viewer which is already looking pretty fantastic. I have been playing with it quite a bit, along with the library where all the 3D molecular visualisation is destined to be kept so that other applications can use it.

Kalzium displaying a 3D rendering of methylbenzenethiol

I have been using ghemical to draw my 3D molecules and am rendering them in Jmol using the POVRay rendering stuff. This is a reasonable solution but I have to say I have found myself wishing for something a little more integrated and intuitive. Above you can see how Kalzium rendered my 4-methylbenzenethiol molecule which encapsulates the gold nanoparticles I spend so much time writing about right now in my thesis.

There is code in subversion that already does some basic molecular editing, geometry optimisation and rendering to screen. Thanks to OpenBabel it can load and save pretty much any chemical data file format too. As you may know this is the last time I am ever likely to be a student and I have put in an application to work on this over the summer with Google Summer of Code (TM). So wish me luck :-)

I have been wanting to get involved in upstream KDE development for quite some time now and this seemed like an amazing way to start. The timing is pretty great too (although right now I am insanely busy finishing up my thesis and so the application period was hard to fit in) as I will complete my thesis before the summer starts and can delay getting full time work until after the summer. Don't worry - I have no plans to leave Gentoo and once I have more free time I will be spending more time on Gentoo again too!

Matter Compilers

In between working on my thesis I have been reading some really interesting posts on a blog from the EPSRC Ideas Factory on the "Software Control of Matter". To me and many others this is an extremely interesting area of research. Its director is Professor Richard Jones FRS who has his own very interesting blog as well as a very good book out on the subject called "Soft Machines" which I have read and would highly recommend.

The Ideas Factory is taking place this week. The blog poses some very interesting questions, and the comments on the posts are well worth a read. I couldn't resist posting a few of my thoughts on what one might do with a matter compiler. My own slant leads me to hope that there will be an open source version available rather than some DRM crippled machine that only makes what the vendor wants you to make/sells you. There are obviously dangers associated with that depending upon how powerful the matter compiler is. I was also reminded that a matter decompiler is very different to a matter compiler by a subsequent commenter - I guess I was just getting ahead of myself as he is obviously correct!

I know I will be reading the blog entries and comments with great interest (even though I should probably be spending more time on my thesis). It also makes for an interesting new and open way for new research proposals and directions to be discussed by academic researchers. I for one think that this approach is great and would like to see more of it.

Coincidentally I also read about an open desktop fabricator developed by researchers at Cornell University in the USA. This is a desktop fabricator you can build yourself. They have also developed software to control it, and you can use several different materials to fabricate 3D objects at home. Not quite nano but still very interesting to see. They have set up a web site about the Fab@Home project that is well worth a visit.

SET for Britain Physics 2006 at the House of Commons

I got my train down to London at 7:14 I think it was, on Tuesday morning; bleary eyed and tired armed with my poster, invitation, small A4 posters and a camera (travelled light due to the security restrictions). Managed to get on one of the new Midland Mainline trains with all the snazzy displays and stuff on it. The train left on time and got in five minutes early, far smoother than previous journeys I have taken down to London. Once on the tube I spotted someone else carrying a poster tube and got chatting to her.

At the Westminster tube station we bumped into more poster tube carrying people all destined for Physics 2006. We went into the House of Commons via a side entrance. This was my first trip to the House of Commons and security was pretty tight with the metal detector arches, X-ray scanners and police officers frisking us on entry. Once inside I found out we weren't supposed to take photos and it became clear that the terrace marquee was nowhere near what I thought of as the House of Commons.

Marcus D. Hanwell at Physics 2006

I was in the second round of poster presentations, so once everyone had set up I walked round and looked at some of the other posters in the first round. There were some interesting posters being presented with some in similar areas to my own research. There was also one poster on string theory that was printed on A4 sheets and looked like a paper that had been printed and stuck up; I have never been to a poster presentation without at least one of these.

The poster boards were an unusual width, which meant my A0 poster had to be cut down from the standard 84 cm width to 75 cm. I have put a copy of it up here if you would like to take a look at it. It is a summary of the three main threads of the thiol encapsulated gold nanoparticle research I have conducted over the last three years. It was designed using the a0poster class and LaTeX using my favourite editor Kile on Gentoo using KDE. I have played a central role in this work but as with most work it has benefitted from collaborations with other researchers who have been credited as additional authors on the poster.

The poster boards were very closely packed in a zig-zag pattern. This made it very difficult when someone was looking at the poster facing mine and someone else came along wanting to look at mine and discuss it. When my poster judge came along she couldn't even see my poster whilst I was explaining it to her which did not help at all. I think the traditional straight arrangement would fit a few less in possibly but it would be far better at a busy event held in quite a small room.

There were several prizes awarded at the event but none of us from Sheffield won one. I did talk to some interesting people on the day. My MP Richard Caborn never replied to me or my coworker, and the MPs in attendance seemed to just want to look at their constituents' posters, get a few photographs taken and leave. I have never encountered MPs before and I did find this behaviour surprising, but I have only attended scientific conferences before this event.

I had a quick look around the bits of the House of Commons we were allowed in after the speeches and prize presentations. Then I met up with a friend, James (edit_21), for a drink and a chat before going for my train. It was nice to catch up as I haven't made it down to London in over a year. I ended up sat next to a chef on the way back up to Sheffield and had a really interesting conversation about everything ranging from scuba diving to careers and family.

It was a really long, tiring day. It was also probably the last trip I would make as a PhD student and so a little disappointing my poster wasn't better received. I got straight back into work on my return and have only just managed to get a day out of the lab today to get some more of my thesis written up.

Physics 2006 Poster Presentation at the House of Commons

I am going to the Physics 2006 poster presentation event at the House of Commons on Tuesday. I designed my poster using LaTeX and the a0poster class as I have for previous conferences. I am really happy with the poster and hopefully it will be well received at the reception.

The poster summarises the three main elements of my PhD work - structural characterisation, UV lithography and sensing/electrical characteristics of the thiol encapsulated gold nanoparticles I work with. I will put up a PDF of the poster after the event - don't want to give anything away until then. I have it here along with my invitation and train tickets.

It was a hectic day on Friday getting it all printed and ready, making sure that it was narrow enough and that all the diagrams and gradients came out OK.

Great Week Away at UK Grad School in Windermere

I was away 8-12 May at a UK Grad school as part of my PhD. I went away not expecting very much from it at all, and I was so shocked. It was a really useful week away; I met lots of interesting people and learnt a lot about myself. I think it has really helped to get me in the right frame of mind to get my PhD finished and find a great job out in California! It has taken me this long to even touch my blog as I have been so busy at work setting my schedule and agreeing it with my supervisors.

I also found the time to sign up for the researchers in residence scheme and attended the briefing day at Sheffield Hallam University (the other one down the road ;-) ). We did lots of really useful stuff at the Grad school such as interview skills (great for me as I am job hunting right now), marketing, negotiation, constructive feedback (the level of honesty was very refreshing) and lots more besides. We also squeezed a lot of socialising in during the few hours they gave us to ourselves!

I now have an agreed schedule for writing up my thesis and papers. I still have a little lab time I can squeeze in and fingers crossed it should be submitted by the end of September fully complete. I will of course be using my favourite open source tools to write up with and was considering writing an article on open source tools and their use in a scientific PhD. Could be a good one to write while I am waiting for my viva...

APS March Meeting

I attended the APS March meeting in Baltimore this year. I didn't know anyone there, and was the only member of our research group out there but I got the opportunity to see a lot of interesting talks and posters. I also checked out the job fair, but it didn't have anything of any interest to me there unfortunately.

This was the biggest conference I have ever been to, with about ten or so parallel talk sessions running most of the time. Contributed talks were only 10 minutes long, and invited talks were usually 35 minutes long. There was a very large range of talks, and as it was more general than previous conferences I had attended I was able to attend a few talks totally unrelated to my research but of interest to me.

There was also a massive range of booths at the exhibition from a whole range of companies. It was interesting to see so many manufacturers, publishers and software companies displaying their latest and greatest products all in one place. The whole conference was on a larger scale than anything I have previously seen.

Some of the talks I saw revealed other groups working on some of the aspects I do but in quite different ways. Not all of this work had turned up in my searches of the published literature so I have quite a few papers I would like to look up once I get back to the office. I also talked to an interesting person on one of the booths about ADS which isn't just for astronomers. I have since used it and found it very useful for finding papers published in physics journals, and when combined with CAS it covers pretty much everything in my field.

First Few Days in Baltimore

Started off my journey on Friday morning at about 5am. Couldn't find my hair fudge but aside from that I think I managed to remember everything I needed. Got to the train station and caught the 5:20 train to London. Louise got me a seat in first class, but first class wasn't as first class as usual this morning. They did the usual barely drinkable complimentary coffee but no food was served. So when I got into London I was starving. Got the tube and the tube replacement bus to terminal 4 at Heathrow.

This was my longest ever journey travelling alone, before this I had just done a short hop to Grenoble, France! It is pretty lonely travelling alone. Security was the highest I have seen it on both sides of the Atlantic. I used the BA online checkin service, checked in my bag for the hold and then I thought I was done. Just had to go through security so I went to the red barriers and was directed to the back of the line. A few times along there I thought I had lost the line but it really was that long! I think I was waiting for about an hour to get through security, thankfully I had arrived with lots of time to spare.

Off to Baltimore in a Few Hours

Well I am going to set off in a couple of hours to catch the train down to London. Heathrow is such a hassle to get to from Sheffield. Looking forward to it but paranoid I have forgotten something I will need. BA only let you take one piece of hand luggage which is going to have to be my laptop, so unfortunately my SLR camera has to stay at home. I will be taking my small Fuji F10 but I would really like to be able to use my Nikon F80 out there. I don't trust it in checked baggage though.

If there are any Gentoo folk who fancy meeting up for a beer I should be able to check mail at the hotel. The conference looks huge - it will be a real challenge taking it all in and seeing as much as possible whilst I am out there. Best continue running round the house trying to check I haven't forgotten anything now!

Baltimore Trip

Well I have booked my flights and hotel room, joined the American Physical Society and registered for their March meeting in Baltimore. I am feeling broke but really looking forward to seeing some of the American East coast (only ever been to California in the US). I am also looking forward to the jobs fair held in parallel to the conference. I will be polishing my CV and registering for it pretty soon.

Managed to arrange for my wife to join me after the conference so that we can look around the area and make a bit of a holiday out of it. This is going to be a tough year writing up my thesis, searching for jobs, finishing up my current research and getting it published. I am really looking forward to looking around Washington DC too as it looks like a really amazing place. Drop me a line if you are in the area and fancy meeting up for a beer one night or have any recommendations of sights we should take in.

Won Joint First Prize for My Final Year Talk

Just given my talk (on my birthday no less) and I won joint first prize for it, so I am really pleased. I wrote it all in Linux using Kile, LaTeX and LaTeX beamer. You can look at a copy of the PDF here if you would like to. It is a summary of two of the main areas of my research - alkanethiol encapsulated gold nanoparticle Langmuir-Blodgett/Langmuir-Schaefer multilayers, and the sensing/conductivity characteristics of these multilayers to various gases and vapours.

I just have to get more of my work published, write up my thesis, find a postdoctoral position (hopefully in the States or Canada) and pass my viva! Shouldn't be too hard, for now I am going to take the evening off and enjoy the rest of my 26th birthday ;-)