November 6, 2019
Assistant Professor, Digital Archivist and Records Manager
Assistant Professor, Science Librarian
An Introduction to the Issue
When it comes to supporting the range of Research Data Management Services (RDMS) that campus communities need in the world of 21st-century scholarly communications, smaller academic institutions are at a major disadvantage. According to an Association of College and Research Libraries (ACRL) survey of more than 200 academic libraries in the US and Canada, a significantly higher percentage of libraries at institutions with more than 5,000 students offered both consultative/informational research data services (RDS) and technical/hands-on research data services as of 2012 (ACRL, 2012, p. 20). Furthermore, survey responses indicated that NSF-research-active campuses and those with multiple doctoral programs (in other words, those spaces where big-money, high-stakes research takes place) are also more likely to have established RDS programs (ACRL, 2012, pp. 22-25).
To most, these results are unsurprising. However, readers might be surprised to know that the same survey indicated that library leaders at both large and small institutions recognized (in comparable numbers) that their libraries need to plan to offer RDMS within the next two years (before 2014), even if they currently did not, in order to “engage the data deluge [of a] new data intensive environment” and remain relevant to campus partners (ACRL, 2012, p. 21). Exactly how libraries at smaller institutions should go about establishing these critical (or even basic) research data services – some (though not all) of which require robust technology, funding lines, and designated personnel with special data analytics skills – remained unclear in this early white paper.
This is the exact scenario that we (a Science Librarian and a Digital Archivist) find ourselves in: at a highly ranked, historic liberal arts college of about 3,000 students and 250 full-time faculty, all of whom are finding, collecting, and sharing research data without the support of the library. Slowly, we have been putting our heads together to try to solve a particularly perplexing component of the RDMS dilemma: how do we scale down the model and still provide quality, sustainable services that suit the needs of a particular community? After about two years of effort, we can see that our advocacy and outreach will continue for many more years before we have a solid foundation for RDMS on our small but active campus of dedicated researchers. Though we are early on in our journey, our hope is that the actions taken and lessons learned outlined here will help others working in similar library settings and facing similar challenges.
Questioning the Existing Narrative
In addition to the obvious limiting factors of less funding and fewer staff resources experienced by smaller institutions, there is another almost invisible barrier that librarians face when trying to get an RDMS program off the ground: a general belief that there is no need or demand for “that kind of service” at “a school like ours.” This notion is part and parcel of the widely held assumption that liberal arts colleges (and other small schools) are “merely teaching institutions” that “do not have the resources to support serious scholarship” and do not have research expectations of faculty (Ghodsee, 2008). Of course, this notion is simply wrong and has been refuted in visible academic media outlets over the last decade (Ghodsee, 2008; Markin, 2008; Ruttimann Oberst, 2010).
The idea that real research and datasets are not components of a liberal arts college was equally perplexing for us to hear at our institution because we, as tenure-track faculty librarians ourselves, are expected to engage in rigorous research and scholarship (that is often founded in data and data collection and usually required for tenure) in the same manner as the other faculty on our campus. Furthermore, as liaison librarians who are well aware of the ongoing grant proposals and research agendas of our fellow faculty members, we knew of countless “data stories” (as they’ve come to be called) that clearly indicated a need for more institutional support with RDMS.
So, what were we to do about this dirty little secret? In 2017, we decided it was time to document these data stories by talking with data stakeholders on campus and asking them what the library could do to better support their professional data needs, be it research, collaboration, teaching, or other more specific scenarios like large data storage or data cleaning and processing. What follows here is a brief explanation of what we discovered and uncovered in our investigation and how we are still working to address identified data needs on our campus.
Ask and Listen
From the get-go, it is important to realize that “data” has varying definitions, connotations, and meanings across the disciplines, some positive and some not-so-positive. In most cases, these working definitions are grey in nature, not black and white. We cannot stress enough a point that Robin Rice and John Southall also make in the introductory chapters of The Data Librarian’s Handbook: librarians need to be aware of the varied feelings and nuances that surround data for their stakeholders and develop, as early as possible, a working definition of what they consider “data” in their unique context (Rice and Southall, 2016, pp. 19-20). So, when we approached faculty on our campus about their data and data practices in Spring of 2017, we were very careful in the terminology we used and the questions we asked. We were also sure to include faculty from almost every division on campus (see Table 1) to paint as inclusive and interdisciplinary a data narrative as possible within the confines of our small sample size (only 16 individuals).
Table 1 – Disciplines of faculty focus group participants
|Science|Applied Social Sciences|Social Science|Humanities|
|Physics|Communications|Political Science|Art (Sculpture)|
In the small focus groups we set up (5 cohorts of 2-6 faculty), we asked participants five overarching questions (see Table 2) to (a) try to get a sense of how they deal with data in their academic workflows and (b) help us understand how the library could potentially provide support for this work in the future. We kept the conversations collegial and informal and allowed the discussion to progress naturally; we offered “snacks” as incentives and used a convenience sample strategy for recruitment. Our goal was simply to establish a baseline for our ongoing investigation about local data stories, needs, and unexplored opportunities. Although we did not create transcripts, we did record all of these focus group sessions, and we coded the audio afterwards to get a sense of how things “boiled down.”
Table 2 – Central questions asked of all focus group participants
|What is the nature and format of your research data?|
|What is your process for creating, updating, and saving research data?|
|What is your perspective on Open Data and (how) do you share your research data?|
|What is your experience with discipline-specific data repositories and Research Data Management Plans?|
|Describe your current challenges with research data, and explain how the college could better support you in that work?|
So what did we find out about the data practices of our faculty participants? A few important trends emerged. Most faculty we spoke to are true independent researchers: each is the only one who does their kind of research on campus, so they don’t talk to their departmental colleagues about their data or how they manage it; they are, in a sense, lone data wolves navigating a wild west-like data landscape. As such, these professors are relying on their own private machines, hard drives, instruments, and money to store and back up their professional research data. Most faculty we talked to feel IT does not have a role in their research data management efforts (some are very opposed to that idea, actually). The exception is that they are very happy to be provided with critical data analysis and visualization tools like NVivo and SPSS, the funding for which comes out of campus IT’s budget.
Another major takeaway: many faculty rely on discipline-specific and open data repositories for their work, but few said that sharing their own data was a priority. They gave a range of very logical explanations for the latter: privacy and confidentiality requirements from governing boards, concerns about incomplete de-identification of research participants, regulations around privately obtained data sets, fears of their research being “scooped” in advance of publication, and even a general absence of standards in their discipline that would require citing or sharing data sets as a recommended or regular practice. The strongest data sharing trends we were able to identify among our participants seemed to be between research collaborators at other institutions, with whom they found ways to share their data out of necessity, but not always easily. Several noted that these external collaborations were usually logistically difficult even if the partnership was drawn out over the course of many years, and on at least one occasion an off-campus data partner experienced corruption and data loss.
Finally, and most critically, focus group participants were challenged by the task of training students in data literacy and manipulation, regardless of their discipline or preferred methodology. We found this a bit unexpected, but of course, as a teaching-centered institution, we really should not have been surprised. These fruitful conversations revealed that providing access to and teaching with appropriate data sets for undergraduates was a shared experience and difficulty for our dedicated cohort of teaching faculty. Professors wanted students in their classes to have a secure place to store data and return to it asynchronously as needed, even throughout the life of a course or a major. Selflessly, these faculty often provided their own research data sets to students as learning tools when they coached them through various disciplinary methodologies. This kind of effort is complicated and usually requires several computer applications. For example, they might need to set up a README file in one space; store the data in another secure location; use a specific tool for presenting, cleaning, and sorting data; and then sometimes use a fourth application for visualization purposes. Such a scenario is a major undertaking for the faculty member and represents a huge commitment in terms of class time.
A few professors also struggled to make students aware of publicly available datasets (the very ones they also use as academic researchers) that could benefit class or independent research projects. Some expressed an interest in a local campus repository where public or “locked” datasets could be housed, accessed, and built up year after year for the benefit of both faculty and students, though others felt that was unrealistic. Finally (and this was perhaps the biggest “ask” of all), some faculty we interviewed wanted more instructional support for their data-intensive courses, specifically in the form of a data specialist who could expertly train students in data analysis tools and methods. Another interviewee was excited about the idea of having “data” tutors for certain departments, in the same manner that the library supplies writing consultants or subject tutors. That concept had certainly been discussed within the library before, but (like the request for a professional data expert) such an initiative would require a significant amount of library funding and resources.
Start with the Easy Stuff
Once we had all this information in hand, we asked ourselves: what are no-cost, little-effort solutions to some of the data problems we heard from our interviewees? In other words, what can we do right now with the resources we currently have? Over the last two years we’ve found at least five areas of opportunity for immediate, no-cost improvements for our small but busy research community.
Create and Share Online Resources
A logical first step for us was to compile free, online, data-friendly resources and share them with our users in a curated and easy-to-navigate space. Our solution to this was an online research guide (LibGuide). Most RDM LibGuides we found explain and link to best practices, including guidelines for dataset naming standards and data versioning. They also typically have criteria and “tips and tricks” for designing a Research Data Management Plan. Many provide links to discipline-specific databases and explain common formats, tools, and file types for data in those disciplines. While some focus on data visualization and speak to more of a popular audience, others are clearly designed for high-level specialized researchers who understand scientific jargon and methodologies. In our own guide we connect researchers with local campus experts on data management like the IACUC, IRB, and grants office. We hope to promote and gauge the use of these guides in the future to determine how helpful they are for our community of researchers and how they might be expanded or improved.
Educate Ourselves in Research Data Best Practices
One of the top five recommended action items for library directors that came out of the 2012 ACRL survey was supporting library faculty and staff in professional development that provides knowledge of RDS, even if there is not a formal RDMS program in their library (ACRL, 2012, p. 40). We are fortunate to have professional development funds to support our librarianship training, and in the last two years we have used much of that funding for a gamut of RDM professional development activities ranging from online MOOCs and webinars to week-long trainings and one-time workshops. We’ve also made an effort to connect with others working in the field who are trying to tackle similar problems; for example, we presented at IASSIST and actively follow several online RDM listservs. We see our professional development efforts as a “train-the-trainer” approach, wherein we tap into the knowledge base of our professional community and then bring that new knowledge back to our institution to be applied and shared with our colleagues.
Participate in the Data Lifecycle
As librarians, our work requires us to meet people where they are (as individual researchers and within the context of a class) with their data needs and help them work either backward or forward to accomplish their goals. In these scenarios we typically find ourselves partnering with our faculty and undergraduate researchers at the beginning and end of the data lifecycle (see Figure 1). For example, in the planning and acquisition stages of a research project, PIs reach out for assistance as they design a Data Management Plan or determine what library and other available resources they can use to strengthen their project or proposal. The library is also a logical partner at the end of the data lifecycle, when researchers are faced with the problem of preserving and sharing their data outcomes in appropriate ways.
Figure 1 – The United States Geological Survey Science Data Lifecycle Model. https://doi.org/10.3133/ofr20131265
However, it’s been argued that there are many more opportunities for library partnership at other stages in the data lifecycle, specifically the data processing and analysis stages. NIH Informationist Lisa Federer argues that librarians can and should provide expertise in new areas such as data visualization and data “wrangling” (a term for reuse, migration, and massaging of data to suit novel research needs) (Federer, 2016, pp. 39-41). We look forward to considering how our librarianship can be more fully embedded into all stages of the data lifecycle, to the benefit of our community of researchers and beyond.
Piecemeal, Peripheral, and Point-of-Need Efforts
Acutely aware of our “multiple hats” as faculty librarians and the sometimes-serendipitous ways those roles overlap (Benjes-Small and Miller, 2017, p. 47), we’ve been on the lookout for opportunities outside of the library to gather more information and input about research data practices and experiences on campus. We didn’t have to look far. For instance, as a member of the IRB, Rachel advises on ethical storage and destruction of research data. She also serves as campus Records Coordinator, writing policy that incorporates national and state-wide requirements for the retention, destruction, and storage of research records and data by any grant-funded PI. In addition, Patti serves on the campus Curriculum Committee, where she keeps her finger on the pulse of what different departments provide for discipline-specific, undergraduate research methods courses, and she asks about data pain points experienced in these classroom environments. These sidebar discussions have been absolutely essential in giving us a wide-angle lens to view RDM practices and issues in our local context.
Planning Outreach and Programming
We do not yet have the bandwidth or “buy in” to properly market the resources and services we’ve discussed here, so outreach initiatives are still in their infancy, but we are working now to develop a roadmap for that important work. We are especially encouraged by the plethora of exciting programming and outreach strategies that libraries both big and small are using to promote RDMS. For example, the newly popularized social media campaign around “Love Data Week” has had immense success. Some campuses have also had success with regular and open “data labs” or “data office hours” for student researchers, while others have focused primarily on supporting faculty needs and education; for example, Project TIER at Haverford College made enormous gains with faculty programming. In terms of the latter, we are hopeful that in the future we can work with our Center for Faculty Development to lead mini-workshops or info sessions about RDM resources and best practices to the benefit of our professoriate.
How do we begin to tackle some of the bigger, tougher RDMS obstacles that faculty spoke to us about in our informal focus groups? In the spirit of data-driven decision making, we believe that starts with gathering more information from campus stakeholders and in more formal ways. Therefore, we are readying for a secondary wave of data collection from our stakeholders, this time in the form of a full, IRB-approved survey with a larger sample size of faculty participants. In addition, we plan to do more systematic interviews with all instructors at the college who teach the data analysis and methods courses for their respective disciplines; they represent an untapped reservoir of knowledge about how faculty and student researchers use, and are challenged by, data on a regular basis.
This knowledge and feedback can help us advocate for more program-level and large-scale improvements that would require dedicated personnel or new funding lines, such as hiring an RDM Services Coordinator and assembling an RDMS Team of librarians, IT professionals, and other critical campus experts (Rice and Southall, 2016, p. 67). Other recommended critical steps for making large-scale change in this arena involve writing an RDM policy that suits the needs of our unique campus community and developing a data literacy curriculum specifically for our undergraduate researchers. Since these lofty goals often feel like massive hurdles to us, we like to take courage from a more positive line in The Data Librarian’s Handbook (2016), and we hope our readers will as well:
This could be an opportunity for you (and the library) to demonstrate leadership by talking about the issues, getting people organized, and taking on pilot projects to establish evidence for the need of more resources for RDM support and services (p. 68).
Tenopir, C., Birch, B., and Allard, S. (June 2012). Academic Libraries and Research Data Services: Current Practice and Plan for the Future, An ACRL White Paper. Association of College and Research Libraries. Retrieved from http://www.ala.org/acrl/sites/ala.org.acrl/files/content/publications/whitepapers/Tenopir_Birch_Allard.pdf.
Ghodsee, K. (April 25, 2008). A Research Career at a Liberal-Arts College. Chronicle of Higher Education. Retrieved from https://www.chronicle.com/article/A-Research-Career-at-a/45760.
Markin, K. M. (February 19, 2008). Big Research, Small College. Chronicle of Higher Education. Retrieved from https://www.chronicle.com/article/Big-Research-Small-College/45957.
Ruttimann Oberst, J. (September 10, 2010). Big Thinking at Small Universities. Science Magazine (online). Retrieved from https://www.sciencemag.org/features/2010/09/big-thinking-small-universities.
Rice, R. and Southall, J. (2016). The Data Librarian’s Handbook. London, Great Britain: Facet Publishing.
Faundeen, J. L., Burley, T. E., Carlino, J. A., Govoni, D. L., Henkel, H. S., Holl, S. L., Hutchison, V. B., Martín, E., Montgomery, E. T., Ladino, C., Tessler, S., and Zolly, L. S. (2013). The United States Geological Survey Science Data Lifecycle Model. USGS Publications Warehouse. Retrieved from https://doi.org/10.3133/ofr20131265.
Federer, L. (2016). Research Data Management in the Age of Big Data: Roles and Opportunities for Librarians. Information Services & Use, 36(1-2), 35-43.
Benjes-Small, C., and Miller, R. K. (2017). The Hats We Wear: Examining the Many Roles of the New Instruction Librarian. American Libraries, 48(11-12), 46-49.