Stuart Chalk

Meeting a New CINF Volunteer Extraordinaire

Svetlana Korolev, interviewer

After engaging for six years as an editor of the Chemical Information Bulletin, I wanted to continue the “Meet your new CINF functionary” interview series (http://www.acscinf.org/content/interviews). For this issue I asked Professor Stuart Chalk to talk about his growing involvement in the Division of Chemical Information. In 2016, Stuart became a new coordinator of the CINF Scholarship for Scientific Excellence program; an assistant Webmaster (and now Webmaster); a co-organizer of the first CINF Data Summit (http://www.acscinf.org/content/cinf-2016-data-summit-251st-acs-meeting-san-diego); a co-author of the Fall 2015 InterCollegiate Cheminformatics Course (http://olcc.ccce.divched.org/Fall2015OLCC), with a follow-up symposium about this course in the Division of Chemical Education program (https://ep70.eventpilotadmin.com/web/page.php?page=Session&project=ACS16spring&id=208333, https://ep70.eventpilotadmin.com/web/page.php?page=Session&project=ACS16spring&id=216599); and a co-author of ten presentations at the spring ACS national meeting in San Diego. In December 2015, he participated in multiple symposia at PacifiChem. Antony Williams, Erin Davis, and Stuart Chalk have formed a task force spearheading outreach activities for CINF.

Stuart Chalk is an associate professor of chemistry at the University of North Florida. He earned a B.S. degree from Loughborough University (U.K.) and a Ph.D. degree from the University of Massachusetts at Amherst (U.S.A.). Dr. Chalk is trained as an analytical chemist with expertise in flow analysis methodology and instrumentation. Over the last fifteen years he has morphed into a cheminformatician working on research projects to develop data standards (the Analytical Information Markup Language: AnIML; Common Standard for eXchange: CSX; Experiment Markup Language: ExptML); electronic laboratory notebooks (the Eureka Research Workbench); scientific ontologies (the Chemical Analysis Ontology: CAO); and scientific data representation.

Svetlana Korolev: Stuart, let’s start at the beginning of your career path. Please take us back to the time when you realized you wanted to be a scientist. Can you name people who influenced your career? How did you then become interested in cheminformatics?

Stuart Chalk: Coming out of high school (in the United Kingdom) I knew I wanted to do something technical. I really liked math and logic, but I did not want to be a math teacher (the only job I thought you could do with math at the time), and I was not interested in a career in physics. I chose chemistry because it was something I liked and, although it was not my best subject, I could see myself working in a lab. This was heavily influenced by my high school chemistry teacher, Mr. Hunt. He let me be the stockroom prep assistant for his classes, and it gave me a great deal of pride.

I picked an undergrad degree in Chemistry with Analytical Science and graduated in 1984, having done an internship as part of the degree that made me realize I was absolutely in the right area. I started getting into computers and started using a Mac in my senior year, which really turned me on to computing. The research advisor I worked for, Julian Tyson, helped me see how research fit in with the grand picture of chemistry, and I was excited when I graduated and got my first job at The Wellcome Foundation (as it was then) in Dartford, Kent.

It wasn’t long, though, before I started to think about graduate school, and, after a few months, I went back to see Prof. Tyson to talk about doing a Ph.D. He emigrated to the United States and I followed to do my Ph.D. During my Ph.D., I found I liked collecting citations and ended up doing lots of scanning of references and optical character recognition (OCR). Then I organized them into this new software called EndNote. From then on I was hooked on informatics as a hobby … at least until recently.

SK: You relocated from the United Kingdom to the United States of America in 1989. Could you walk us through what it was like? Please share your views of the academic systems in the two countries. Are there any noticeable distinctions in how students study, conduct research, or seek scientific information?

SC: Retrospectively, it was a courageous thing to do on my own, but at the time it was an adventure and I had nothing to lose. Certainly, the two academic systems are different, with the United Kingdom being much more prescribed in terms of the classes. There was no “picking a major” when you get to college. I picked my undergraduate course when I was in the last year of high school. The levels of educational advancement were also different. Students in graduate school were not as far along in the United States, and this helped me do well in my first year as I repeated some classes I had done in the United Kingdom. As for doing research, there was not much difference in the approaches between the two systems, with the exception that it was generally easier to get inter-library loans in the United States.

SK: At “The Growing Impact of Openness in Chemistry: a Symposium in Honor of J.C. Bradley” during the fall 2015 ACS meeting you introduced the beta version of the Open Spectral Database (OSDB, http://osdb.info), and then talked about this project as well as the NIST-IUPAC Solubility Data Series (https://sds.coas.unf.edu/) and the Flow Analysis Database (FAD, http://www.fia.unf.edu) during PacifiChem in December 2015. Please tell us a little bit more about your current research projects. Why do you work on them? How does your research fit in the larger scheme of the field? Do you have collaborations through CINF?

SC: My initial foray into informatics was based on an interest in my research field. The OCR work I mentioned before turned into a Web site of citations on the general area of flow analysis. Its scope spans the areas of flow injection analysis (FIA), sequential injection analysis (SIA), zone fluidics (ZF), and post-column derivatization for HPLC. The Web site (https://www.fia.unf.edu/) is still available, although these days it looks terribly dated. I am working on a new version (currently at https://chalk.coas.unf.edu/fad) that will replace the old site over the summer, uses current technology, and provides an application programming interface (API).

The initial work on the FAD server was done using an Apple Server, Lasso, and FileMaker. This first Web site got me into a grant from the NSF NSDL (National STEM Education Distributed Learning, 2001) program to develop the first Analytical Sciences Digital Library Web site (ASDL, http://asdlib.org). In doing that I learned how to use Apache, PHP, and MySQL, which I liked much better as a development system (and it fit my budget: it was free). This also got me into Dublin Core and implementing an OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) interface to allow the main NSDL site to harvest the collection from the ASDL.

My current motivation is grounded in the firm belief that there are a lot of high-quality research data out there that, although it takes time and effort, deserve to be picked out of the research literature haystack and made available, searchable, and reusable. This is why I am working on projects with SpringerMaterials, extracting property data from the Landolt-Börnstein database, and with NIST to “semanticize” (if that’s a word) the existing IUPAC Solubility Data Series volumes that were originally digitized twenty years ago. These are quality data sets but, in today’s environment, difficult to use, as they are electronic (PDF) but not usable in an informatics sense. Yes, there is a lot of tedious work to accurately capture the property data and annotate its context well enough to make it useful, but it’s worth it. Good data are worth it; they are a valuable commodity in today’s big data marketplace, and scientists need them for many different applications.

As for CINF collaborations I don’t yet have any formal ones, as I am new to the community. However, CINF has been a revelation for me. By that, I mean for years I have attended Pittcon (Pittsburgh Conference on Analytical Chemistry and Applied Spectroscopy), presenting papers and posters on the informatics work I was doing. No real interest. Then, Bob Lancashire of the Joint Committee on Atomic and Molecular Physical Data (JCAMP) fame invited me to give a talk at the ACS meeting in Indianapolis in fall of 2013. Wow! There was a group of people that liked the things I was doing, and they were doing really cool stuff. It was a career moment; I had found a real home. Since then it has been a race to catch up with the community and now I am hitting my stride. Of course, I have also been encouraged by the most charismatic of CINF members, Tony Williams, to work on cool projects. The OSDB (Open Spectral Database) is one of those: instigated over a beer in Denver. Tony said: “We need open spectra, period.” I said: “OK,” and went away to build the Web site in a few months. It is all about making data accessible and usable; why else would we put it on a Web site? I am also interested in molecular structure representation, scientific units, ontologies for chemistry, electronic lab notebooks, bibliometrics… too many interesting things…

SK: At the University of North Florida (UNF) you teach a Senior Seminar and courses on Modern Analytical Chemistry, Quantitative Analytical Chemistry, Structure Elucidation, General Chemistry, and Chemical Informatics. In 2004 you received a UNF Outstanding Undergraduate Teaching Award. Your students posted some fascinating “rate my professors” comments about you, for example: “He has a wonderful sense of humor and an extraordinary grasp of his subjects! He is overly competent and truly prepares his students for their majors… if you like sci-fi and British humor, he is definitely the professor for you.” Could you share with us some of your brightest teaching moments that make students praise you so highly at UNF? Also, please talk about the Fall 2015 InterCollegiate Cheminformatics Course.

SC: I have been lucky to be a very different type of person in the Department of Chemistry at UNF (this is my 20th year). By that I mean, I am really not a hard-core chemist, not naturally good at chemistry. This means I have always had to work at learning and understanding chemistry, and I think it’s that background that helps me relate to students. I try to bring professionalism and organization to the classroom, with high expectations, but I also realize that to communicate chemistry effectively you have to be engaging, honest and real. In addition to having a wry sense of humor and a British accent (a common joke I make in class is “I don’t have an accent – y’all have an accent!”), I regularly do strange things in class or make weird analogies when I see students’ eyes glaze over. If you create a ‘memorable’ moment, students remember that. At some point I might actually do step aerobics in class to talk about energy transformations, but I’m not quite courageous enough yet!

I am lucky to have had many meaningful teaching moments in my career so far, especially with all those research students of mine who have gone on to graduate/professional school, and others coming well after the fact. One example is a chemistry major who, a few years after graduating, contacted me to let me know that he really appreciated what I did for him. In particular, at the end of our senior level Instrumental Analysis course, I gave the student an A-. He remembered asking “Why didn’t I get an A?”, to which I replied: “Because your work wasn’t A quality.” It made him really think about what quality meant, and he used it to great advantage in getting a job at Anheuser-Busch. He has risen through the ranks and is now a Global Senior Quality Assurance Manager there.

More recently, last month in fact, out of the blue I got the following email from a student who was at UNF for only one year.

“It's been awhile. I can guarantee that you won't remember me, but you were my general chemistry professor in the fall of 2011. I am writing you today because I want to thank you. On May 7th, I will be graduating with a degree in chemical engineering from the University of South Florida and I owe it all to you... I sat in your class the first day of the fall semester completely hating the fact that I had to take general chemistry. My high school teacher ruined the subject for me. After taking your course, I learned to love chemistry in a new way, which I attribute that [sic] to your teaching style. It was in that classroom that I first learned of chemical engineering as an option. I decided to leave UNF and change my major from finance to chemical engineering… Anyway, I wanted to thank you for changing my life's path to something I would have completely disregarded without your course.”

As you might expect, it is this kind of feedback that makes teaching so rewarding, and makes it easy to continue even though you’ve been doing it a long time. But doing new things is also good. For me that has been teaching Chemical Information Science as a class to juniors and seniors. I’ve done this for a while but the class has morphed in terms of content, which goes along with there not really being a good definition of what the topic area is exactly. Luckily, I got to meet Dr. Bob Belford, who has been running the NSF-funded OLCC (OnLine Chemistry Course) for a while. He put together a group of faculty to develop different modules for the course, and we taught it as a “flipped” style class. Students at four campuses participated and each one of them got a lot out of the experience (and many presented at the ACS meeting in San Diego this last March). Likely, this will end up as course material on the ChemWiki, and we will teach it again in the next couple of years.

SK: How do you satisfy your own chemical information needs? Which databases, search engines, or current awareness tools do you use the most?

SC: I use the common ones like PubChem, ChemSpider, Chemical Identity Resolver, and CrossRef. I also like to build Web sites to serve up data and for that I use Apache, MySQL, and PHP. I also use JavaScript and Bootstrap for the GUI (graphical user interface). I am very interested in using JSON-LD (JavaScript Object Notation for Linked Data) for storing and semantically representing chemical data.

SK: Do you have plans to write a book? If so, what is its subject?

SC: I’ve had a plan to write a book for the last five years. It’s called “XML, Metadata and Markup Languages for Chemists.” I have had trouble writing it because there are so many interesting things to do, but if I were to start one right now it would be “Semantic Technologies for Chemists.” Hopefully, someday I will get to do both of these.

SK: Stuart, you are now actively contributing to the Division of Chemical Information technical program and various functionary activities. What was your first encounter with CINF? Are you a member of any other ACS divisions or professional societies? What do you enjoy most about your involvement in our division? Could you speak of some initiatives considered by a new outreach taskforce?

SC: My first encounter with CINF was at the fall 2013 meeting in Indianapolis where I met Tony Williams. Tony has been a great mentor to me since then and I have learned a lot, especially what a “skunkworks” project is. I am a member of the Analytical Division of ACS, the Royal Society of Chemistry (RSC), and ASTM, and I hope to soon be involved with IUPAC. The CINF division has been very welcoming to me and it has been great to talk to people that I can really relate to. The best (and worst) part about this is that I can’t seem to say “No” to projects that people propose. As for the outreach taskforce, that is still in its infancy, but we are definitely looking to identify members of CINF who can liaise with other divisions and we would like to organize symposia at meetings in other divisions to get the word out.

SK: Would you like to talk about the CINF Scholarship for Scientific Excellence program? What made you volunteer for the position of its coordinator? Has everything worked out as planned for San Diego? Was it highly competitive for the applicants and a tough decision for the jury? Do you have long-term commitments from its sponsors (e.g. InfoChem/Springer for spring and RSC for fall), or do you have to find funds for the next year?

SC: I thought that the CINF Scholarship for Scientific Excellence was a way that I could contribute to the CINF division without taking on a large burden. Thanks to Guenter Grethe’s advice, everything went well and we had a great group of submissions. Picking the winners was not so easy because of the variety and quality of submissions, but I think we got it right. ACS Publications has kindly agreed to sponsor the Scholarship Program for the fall meeting in Philadelphia, and we are hopeful they will continue it further.

SK: The Division of Chemical Information was featured in the ACS National Meeting highlights for its first Data Summit organized for five days in San Diego. Please share some highlights of your symposium “Chemistry, Data, and the Semantic Web: An Important Triple to Advance Science” organized in collaboration with Evan Bolton. If you were to give “The Best Presentation Award,” would you name a special talk at the Data Summit?

SC: Wow. That’s a tough question. I would probably pick Michel Dumontier from Stanford, who gave a talk on semantic approaches for biochemical knowledge discovery. In his talk he clearly demonstrated the tools and approaches that chemists need to work toward in order to really make knowledge discovery possible. I particularly liked his question: “How can we automatically find the evidence that supports or disputes a scientific hypothesis using the latest data, tools and scientific knowledge?” This is the crux of what we in informatics are trying to do: encode meaning to information so that computers can infer relationships from large data repositories using semantic technology. All the infrastructure is there; we need to translate chemical knowledge across to the digital area using appropriate ontological representation. I believe we will have this in the next ten years if we make quality research data open and reusable to the community.

SK: Which symposia and presentations are you planning for the Fall 2016 ACS National Meeting in Philadelphia?

SC: Sadly, I won’t be attending the fall ACS meeting this year. It is the first week of fall classes at UNF and thus bad timing for me.

SK: You joined a small taskforce for investigating a new host for the CINF Web site last fall. Would you like to share any updates of your investigations? Are there any other challenging issues with the CINF Web site?

SC: All I can say right now is that the current website is in dire need of a revision. I think those of us looking into this see a lot of opportunity to use a new website as a way to engage the community, reach new members, and provide tools for members to be able to show off aspects of their work and interests in a practical way. We also need to capture more data about our members, their interests, needs, skills, and that is an obvious way to give our members more for their membership.

SK: Stuart, let me finish our interview by asking a few personal questions. What are your favorite activities outside work? How would you spend your “dream” vacation? What kind of music do you like to listen to? Is there anything else I did not ask that you want to add?

SC: I’m a runner, which helps me unwind, and I love to play golf. As for a dream vacation: some place with beaches, golf courses, and plenty of good (dark) beer. Musically I am quite varied, all the way from electronic (Jean-Michel Jarre), to hard rock (Rush), to 80’s (Abba). I love to sing and whistle along to tunes at work and my colleagues are very patient with me. I might try singing in a band at some point; who knows?

A parting comment would be this: We need to show that informatics in chemistry is important. Not to ourselves, but to the real bench chemist. Once we get folks to realize that what we can do for the chemistry community will make them more efficient, more knowledgeable, and improve the quality of the work they do, we will really be able to move chemistry into the digital era.

SK: Thank you for sharing your expertise with the Division of Chemical Information. Best wishes for all your endeavors.