Organizing your Evaluation Data: The Importance of Having a Comprehensive Data Codebook
By J.A. Morrow
![](https://cehhs.utk.edu/elps/wp-content/uploads/sites/9/2024/05/Jennifer-Morrow-Bio-3.jpg)
Data Cleaning Step 1: Create a Data Codebook
As some of you know, I love data cleaning. Weird, I know, but I have always found it relaxing to make sure that I have all my data (or my client’s data) organized and cleaned before I start addressing the evaluation questions for a project. Many years ago, my colleague, Dr. Gary Skolits, and I developed a 12-step method for data cleaning. Over the years we have tweaked the steps and brought on another colleague, Dr. Louis Rocconi, to refine and enhance our workshop training on this topic. One thing, though, has remained consistent, and it is what I believe to be the most important step: Create a Data Codebook!
Why a Data Codebook?
One of my pet peeves is a disorganized project and inconsistency in how data are organized. For every project, whether it is an evaluation, research, or assessment project, I start developing a data codebook before I even begin data collection. When I take on a new project from an evaluation or assessment client, I first ask for their codebook, and if they don’t have one, I create it for them. Why is this so important, you ask? Think of your codebook as your organizational tool and project history all rolled into one document. It contains everything about your project and greatly aids in getting everyone on your team organized and on the same page. Your clients (and your future self) will greatly appreciate this too!
Your data codebook is a living document: it changes throughout the life of a project as you add new data, modify data, and make decisions along the way. Not having a data codebook can lead to confusion and increase the chances of someone on your team making a mistake when analyzing data and disseminating information to your clients. Sadly, I have sat through presentations where a client points out a mistake or has a question about the data that can’t be answered by the evaluation team because they don’t have a record of what was done. Clients are never happy when this happens!
What is in a Data Codebook?
I usually include the following 9 things in my data codebooks:
- Name of the Evaluation Project
- Variable Names
- Variable Labels
- Value Labels
- Newly Created/Modified Variables (and how you created/modified these)
- Citations for Scales and Sources of Data for the Project
- Reliability of any Composite Items
- List of Datasets and Sample Size for Each
- Project Diary/Notebook
I typically put the first 7 in one table, which I create in Microsoft Word. You can also create your codebook using Excel or any other analysis software package (e.g., SPSS, R). This first table provides details about all of the data for a project. As I make any changes to the datasets, I add any new variables that I create to this table and write up my decision making for any changes in the project diary/notebook section of my codebook.
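If you would rather keep this first table in a script than in Word, the following is one possible sketch in Python. It is only an illustration: the column names, the `CodebookEntry` class, the `codebook_to_csv` helper, the project name, and the two example variables are all hypothetical and are not taken from any real project or from the 12-step handout.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class CodebookEntry:
    """One row of the codebook table (hypothetical column layout)."""
    variable_name: str
    variable_label: str
    value_labels: str = ""
    created_or_modified: str = ""   # how a new/modified variable was made
    source_citation: str = ""       # scale citation or data source
    reliability: str = ""           # only relevant for composite items


# Hypothetical project name and variables, for illustration only.
PROJECT_NAME = "Example Evaluation Project"

entries = [
    CodebookEntry(
        variable_name="gender",
        variable_label="Participant gender",
        value_labels="1 = Female; 2 = Male; 3 = Nonbinary; 9 = Missing",
        source_citation="Intake survey",
    ),
    CodebookEntry(
        variable_name="sat_total",
        variable_label="Satisfaction composite",
        value_labels="1 = Very dissatisfied to 5 = Very satisfied",
        created_or_modified="Mean of items sat1 through sat5",
        reliability="To be filled in once computed",
    ),
]


def codebook_to_csv(project_name, entries):
    """Return the codebook table as CSV text, one row per variable."""
    columns = ["project"] + [f.name for f in fields(CodebookEntry)]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for entry in entries:
        writer.writerow({"project": project_name, **asdict(entry)})
    return buffer.getvalue()


print(codebook_to_csv(PROJECT_NAME, entries))
```

The resulting CSV text can be saved to a file and opened in Excel, so team members who do not use Python can still read and edit the table.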
For the list of datasets and sample sizes I usually have that as a separate table at the end of my codebook. As I create a new dataset or project file I enter that information in this section of the codebook. I also include a brief description of what is contained in the new data file. I always organize this table by the most recent files first.
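The dataset list can be kept the same way. Below is a minimal sketch, assuming each data file is logged with a creation date, a sample size, and a short description; the `DatasetRecord` class, the `most_recent_first` helper, the file names, dates, and sample sizes are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DatasetRecord:
    """One row of the dataset list (hypothetical fields)."""
    file_name: str
    created_on: date
    sample_size: int
    description: str


def most_recent_first(records):
    """Return the records sorted so the newest file comes first."""
    return sorted(records, key=lambda r: r.created_on, reverse=True)


# Hypothetical files, for illustration only.
inventory = [
    DatasetRecord("survey_raw.csv", date(2024, 1, 15), 250,
                  "Raw survey export, no cleaning"),
    DatasetRecord("survey_clean_v1.csv", date(2024, 2, 3), 238,
                  "Duplicates and ineligible cases removed"),
]

for record in most_recent_first(inventory):
    print(f"{record.created_on.isoformat()}  {record.file_name}  "
          f"n={record.sample_size}  {record.description}")
```

Sorting on the date field instead of relying on the order in which rows were typed keeps the newest file at the top even when several team members add entries.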
Lastly, I include an extensive project diary/notebook as part of my codebook. For some projects this can be very long, with many team members adding to it, so I typically include it as a document link in the codebook. The link takes team members to an external Google document where we can all write and edit information about what we are working on for the project and what decisions were made. I cannot overstate how important it is to have a detailed project diary/notebook for an evaluation project. It is especially useful when you are writing your reports for your client and need to explain what you did and why you did it in a particular way. Anytime I have a project meeting with my team or a meeting with my client, I take notes in our project notebook.
Additional Advice
So, I hope I have provided some useful tips as you start the process of organizing your evaluation data. One last piece of advice: share this codebook with your client! At the end of a project, I give the codebook (minus the project notebook, as that is internal to my team) and final datasets (sanitized at some level depending on the contract) to my client so they can continue to use the data for their program/organization. Empower your evaluation clients to better understand their data and how their data were processed!
Resources
12 Steps of Data Cleaning Handout:
https://www.dropbox.com/scl/fi/x2bf2t0q134p0cx4kvej0/TWELVE-STEPS-OF-DATA-CLEANING-BRIEF-HANDOUT-MORROW-2017.pdf?rlkey=lfrllz3zya83qzeny6ubwzvjj&dl=0
https://datamgmtinedresearch.com/document
https://dss.princeton.edu/online_help/analysis/codebook.htm
https://ies.ed.gov/ncee/rel/regions/central/pdf/CE5.3.2-Guidelines-for-a-Codebook.pdf
https://libguides.library.kent.edu/SPSS/Codebooks
https://web.pdx.edu/~cgrd/codebk.htm
https://www.datafiles.samhsa.gov/get-help/codebooks/what-codebook
https://www.icpsr.umich.edu/web/ICPSR/cms/1983
https://www.medicine.mcgill.ca/epidemiology/joseph/pbelisle/CodebookCookbook/CodebookCookbook.pdf