Quantitative Data Skills for Undergraduates: Training Social Science Students to Work with Data

February 15, 2021

Amelia Kallaher
Applied Social Science Librarian, Mann Library
Cornell University

Whitney Kramer
ILR Research and Data Librarian, Catherwood Library
Cornell University


While many university research data management services focus on supporting faculty and graduate-level students, we identified a gap between what undergraduates needed with regard to quantitative data support and what they were typically receiving. We found that while upper-level undergraduate students were taught the statistical analysis skills they needed, they were often not given any guidance on where to locate data for their projects. Faculty assumed students had been taught these data literacy skills by the time they began working on a senior honors thesis or extracurricular research. To address this, we developed a virtual seminar for undergraduates consisting of six 60-minute data-focused sessions with an accompanying Canvas course.


Cornell University Library (CUL) is made up of 20 distinct libraries serving an undergraduate population of just under 15,000. Catherwood Library, which serves the Industrial and Labor Relations (ILR) School, is the most comprehensive resource on labor and employment in North America. Mann Library, serving the College of Agriculture and Life Sciences and the College of Human Ecology, is the home for research and discovery in the life sciences, agriculture, applied social sciences and human ecology at Cornell University and the second largest library on Cornell’s main campus. Both are New York State land-grant libraries, supporting not just the needs of the university, but the research needs of citizens of the state of New York.

Deciding to Focus on Undergraduates

While at Cornell, undergraduates who participate in an honors program have an opportunity to complete an honors thesis during their senior year. This is an original, independent research project formatted to mimic a scientific paper or research article in their field of study. Some colleges require a specific research methods class while others use faculty mentors or an independent study course. Because each college has different requirements, timelines, and deadlines for the thesis, offering a uniform for-credit course or workshop is a challenge.

We realized a single one-shot instruction session, whether taught for a for-credit class or as a separate library workshop, would not sufficiently support these students. We opted to create a longer seminar to address multiple concepts that an upper-level undergraduate might need in order to successfully find and organize quantitative research data for their senior thesis. We assumed that the students enrolling in our seminar were not enrolled in a for-credit research methods class and were starting (effectively) from scratch. 

Overcoming Challenges

Pre-pandemic, we had conceived of this as a traditional, in-person workshop series, with hands-on practice locating datasets and software tutorials. When Cornell shifted to remote instruction in March 2020, we decided to plan for an online seminar. We realized it made more sense to focus on the theoretical elements of working with data rather than on statistical software, as that content is commonly covered in statistics classes or in workshops offered by other groups on campus. Our reference interactions with students who were ready to write a senior thesis indicated that they already had some background knowledge in introductory statistics. However, these conversations also indicated that undergraduates in the social sciences were not equipped with the theoretical skills to locate data and prepare it for analysis. We saw that teaching these much-needed skills would fill a void in both undergraduate knowledge and instructional needs on campus.

With this knowledge in hand, we utilized both the ACRL Framework for Information Literacy (Association of College and Research Libraries, 2015) and the Association of Public and Land-grant Universities (APLU) recommendations (AAU APLU Public Access Working Group Report and Recommendations, 2017) to develop a six-week curriculum providing a broad overview of the social science research process with respect to working with quantitative data. Students would be introduced to core social science research concepts and receive in-depth instruction on various elements of finding and working with social science datasets for their research. 

Description of the Seminar

While CUL’s typical mode of instruction is one-shot sessions for individual courses, our longer seminar operated independently of a for-credit course and reached a range of undergraduates. Due to COVID-19, the seminar was conducted via Zoom webinars. Each session consisted of a 20-30 minute presentation from a librarian and a complementary 20-30 minute presentation by a guest speaker or panel, with time for questions. Teaching the seminar virtually enabled us to reach both on-campus and remote students during the fall semester.

The first half of the series provided a comprehensive look at the research process with particular emphasis on how to choose a topic, and included a panel discussion with graduate students in various social science fields. From there, we moved on to locating social science data and using collaborative tools for data-focused group projects. The second half of the series delved into working with data in greater detail, with sessions focused on basic concepts of data replication and FAIR principles, data citation, and codebooks. Students were exposed to a range of both general and data-specific concepts, giving them an introduction to the skills needed for their research, with a particular focus on data literacy.

Because we opted to teach virtually, we utilized Canvas to provide an asynchronous component to the series in addition to the live sessions. Canvas provided us with the ability to post recordings of the sessions and make the live content accessible asynchronously, or to those who might find out about the seminar after the fact. It also enabled us to distribute the slides and other supplementary materials with relative ease. We had a limited amount of time for each session and a lot of content to cover, so creating content in Canvas, a platform that Cornell students were already familiar with, enabled us to easily provide supplementary content in a way they understood.

Assessment and Future Directions

There was a significant gap between the number of registrants and actual engagement with the webinars and asynchronous materials. However, we benefitted from targeted outreach to a number of honors programs, and had a core group of students from an undergraduate research fellowship attend each week. These students used the seminar to fulfill a research requirement they were unable to meet through their normal channels because of COVID-19. In the future, we will continue our targeted outreach practices, as we found this was the best way to sustain engagement over the course of the entire series. For librarians looking to replicate this seminar, we recommend considering which undergraduate programs have a student population that would benefit the most from the content and focusing outreach on those specific groups.

While we found that students ultimately engaged more with the course materials asynchronously than synchronously, we appreciated that the combination of Zoom and Canvas allowed us to provide options that accommodated various learning styles and time zones. At the conclusion of our last session, we deployed an assessment survey asking participants what they had learned, which sessions they had attended, and which mode of learning they utilized or preferred. Since the response rate to our survey was low, we assessed the seminar using statistics such as our registration numbers, the number of students who self-enrolled in the Canvas course, and view counts on our recordings and readings. A total of 98 students registered to attend the live sessions, live attendance ranged from 8 to 20 students per session, and 56 students engaged asynchronously in the Canvas course. Going forward, we intend to continue offering both options (synchronous and asynchronous) to students in order to accommodate various learning styles.

Our future goal is to turn this into a for-credit course, allowing us to greatly expand the range of research methods and materials covered. We saw interest from other librarians at CUL in how they could adapt it to fit their colleges’ needs with regard to research data. There are other research methods classes taught by librarians, but we’ve seen interest in replicating our Canvas course for content that is traditionally taught only in person. Additionally, we’ve been contacted by undergraduate honors programs outside the social sciences with requests to create similar content tailored for their students’ research projects.


This seminar was an experiment to see if undergraduates would benefit from instruction that would otherwise have focused only on the data management skills of graduate students and faculty conducting grant-funded research. Not only did we find keen interest from undergraduates, honors program directors, and fellow librarians, we were also encouraged by the usage of and interest in the Canvas course. While we initially intended to offer these sessions in person, we feel switching to remote teaching enabled us to reach more students than we otherwise would have. The course also gave us an easy way to show faculty, through one simple link, the type of content we can create and the format we teach it in.

As we continue to teach and evolve the seminar, we hope to gain further insights into the needs of undergraduate social science students who are working with data and how the library can support them in their learning and research efforts. 


AAU APLU Public Access Working Group Report and Recommendations. (2017, November 29). Retrieved December 17, 2020, from https://www.aau.edu/sites/default/files/AAU-Files/Key-Issues/Intellectual-Property/Public-Open-Access/AAU-APLU-Public-Access-Working-Group-Report.pdf

Association of College and Research Libraries. (2015, February 9). Framework for Information Literacy for Higher Education [Text]. Association of College & Research Libraries (ACRL). http://www.ala.org/acrl/standards/ilframework