ESM: Building Blocks for a Data Science Career
By Anthony Schmidt
![](https://cehhs.utk.edu/elps/wp-content/uploads/sites/9/2024/07/Screenshot-2024-07-11-145353.png)
When I began the ESM program in 2018, I was unsure of the career path I would follow. I knew I wanted to do “research” on something related to education, but I was unsure of what that was. As I went through the program, I naturally began to focus more and more on quantitative skills (e.g., statistics, psychometrics, programming). Little did I know at the time that these skills, as well as the general research, qualitative, and “soft” skills I was gaining, were preparing me to be an excellent candidate for an educational data scientist role within the EdTech industry.
I have been a data scientist at Amplify, an EdTech company that publishes curriculum products and offers an online teaching and learning platform, for nearly three years. Data science, while ubiquitous as a term and job title, is unfortunately a vague concept. It can mean a variety of different things, from basic descriptive data analyses to complex machine learning development operations. It spans an entire continuum that represents data from end to end – from its generation in various applications, assessments, or surveys all the way to its consumption in statistical reports, business intelligence dashboards (made in applications like Tableau or PowerBI), or fraud alerts.
In my time as a data scientist, I have performed many roles along this continuum. On any given day, I may be in meetings that involve new product features and the data that will be generated from them, and how best to extract that data and create useful data warehouse tables. I may be advising other teams on how best to use our data to build teacher-facing reports on student learning. I may be building a model in SQL that will deliver data to a dashboard used by customer account representatives who need to understand a district’s usage of a particular product. Or I may be using R to analyze millions of rows of performance data to understand patterns of learning through complex multilevel models or psychometrics. As a data scientist, my role is to be an expert in the data at any point in its lifecycle. If this sounds exciting – it is!
From ESM to DS
The ESM program helped me move into a career in data science by building three broad areas of competency: technical skills, domain knowledge, and power skills.
In terms of technical skills, becoming proficient in R was a key competency that helped me land a job in EdTech. R is the language of statistics and one of the key languages of data science (alongside Python and SQL). During my time in the ESM program, I became what I would describe as an advanced user of R. I not only knew how to run individual statistical analyses but built up skills in functional programming (e.g., writing functions to implement DRY [don’t repeat yourself] principles), literate programming (e.g., using R Markdown to build automatic reports, my CV, and even my dissertation [Github link; TRACE link]!), software development principles (such as use of git), and even package development.
Before my ESM courses, I was not a programmer in any sense. I dabbled in some HTML and CSS as a teenager, but mostly through WYSIWYG-based (“what you see is what you get”) development environments. I can point to Statistics in Applied Fields III as the course where I began taking programming more seriously. In particular, Multilevel Modeling and Advanced Measurement (both of which were R-based) were where I really leveled up my skills, and then various internships and projects (including my portfolio and dissertation) forced me to upskill even more. One area I particularly enjoyed was building advanced data visualizations using the ggplot2 package. This led to various research opportunities, a pretty cool poster presentation related to data viz on Twitter, and even a career as a data visualization designer prior to becoming a data scientist.
Becoming an advanced user of R built up a mental schema that made any data-based project easy to tackle, as I had a large technical toolset from which to draw. It also made learning new R-based frameworks easy, such as Tidymodels for machine learning or Plumber for API deployment. Furthermore, it provided a foundation for learning additional computer languages, including SQL and Python.
While programming skills like these are important in data science, they are not enough. You also need to possess what I am broadly referring to as domain knowledge. This category encompasses the quantitative domain, the research domain, and the education domain.
What often sets a data scientist apart from a data analyst is the quantitative methodological skills that the data scientist brings to the table. We are tasked with not only describing data but inferring complex relationships from it. Having domain knowledge in quantitative methods is a key competency for data science. We are often asked to use various methods to examine relationships, make inferences, and sometimes establish causal relationships (often in the form of A/B tests). Having a solid foundation in regression techniques (e.g., OLS, logistic, multilevel) facilitates this. Furthermore, this foundation also makes learning new techniques to help answer questions or solve problems much easier. For instance, I did not take any courses on generalized linear models (beyond logistic regression), machine learning, or sentiment analysis, but I have had to use all of these methods. Learning to do so was easier because of the foundational quantitative skills I learned in my ESM courses, especially the multilevel modeling course.
A related but separate domain is “research” – being able to design a research project (whether that is observational, survey, experimental, etc.) and understand when to employ quantitative vs. qualitative techniques is also a much sought-after skill. I am in many meetings where I have to think through the best way to collect data in order to answer questions (i.e., do research). Sometimes, this also involves suggesting qualitative ideas to our user experience researchers or working with them on mixed methods approaches. So, while having a quantitative background is extremely useful, having general research methods skills helps to place quantitative research within a more purposeful context that solves business problems or answers strategic business questions.
While not applicable to all data science roles, having a background in education also certainly helps in the world of EdTech. I came to the ESM program with a background in language instruction (TESOL) and about 10 years of teaching experience. That helped establish a mental context in which I could apply real or hypothetical research projects. Many of our courses, readings, and assignments were also contextualized within education, whether that was K-12, higher education, or adult education. All of these experiences translate into helping ground my understanding of my company’s data into a familiar context, one in which I can explain teacher and student actions in terms of pedagogy, theory, and practical experience. Even if you have no prior experience in education, the ESM program offers numerous opportunities to learn about and research a variety of educational contexts.
Throughout the ESM program, we are steeped in an environment where we need to employ power skills, also often referred to as “soft” skills. I often work on cross-functional teams made up of myself and engineers, product managers, or content authors. These are what we might consider non-technical stakeholders in various projects. Being able to pitch ideas, understand requirements, or translate complex analyses into audience-friendly terminology is essential. These tasks directly reflect the group work and presentations we often had to complete in ESM courses, as well as the series of required program evaluation courses. While I am not an evaluator and I don’t work in an evaluation setting, the skills I learned in these courses, particularly Program Evaluation III, are essential for working with various stakeholders in these cross-functional groups.
Finally, one skill we often take for granted is being a “fast learner”. It is an absolute requirement in any job setting, and no less true for working in data science. Being a graduate student is nothing if not an exercise in 4+ years of being a fast learner. It is something that should be emphasized in any interview. You are never going to know everything, but your experience as a graduate student demonstrates that you have the ability to learn, quickly, and often in a fast-paced environment – a perfect description of EdTech.
Advice for Aspiring Data Scientists
To wrap up this blog post, I would like to offer some basic advice for those interested in a career in (educational) data science. First, I’d recommend completing as many quantitative courses as possible both inside and outside of the ESM program. If you don’t see something you want to learn being taught, I’d recommend working with a professor and learning those skills for credit as part of an independent study. I’d also look into the educational data science graduate certificate that UTK offers.
I would also recommend doing a search on Google Scholar – both journal articles and dissertations – to understand the landscape of data science research within education. This can help you frame various projects, inspire your own dissertation, or identify methodological areas you would like to learn about.
Finally, I would strongly recommend finishing your PhD program with a solid background in R and an intermediate level of proficiency in SQL. If you can add in Python, that will make you an even stronger candidate. Take advantage of LinkedIn Learning (that is how I learned SQL) while you have it!
I hope that my blog post has given you some insight into how I have translated my ESM skills into a career as an educational data scientist. Feel free to reach out to me anytime with questions related to ESM or job hunting in EdTech. You can find my latest contact info and CV information here: https://www.anthonyschmidt.co/.
Good luck!
Additional Resources (beyond ESM courses and your professors!)
- LinkedIn Learning (available through UTK) for learning R, Python, SQL, and ML
- SQL Exercises – I used these to prepare for several DS interviews
- bnomial Daily ML questions