This document was created to showcase my research and analytics abilities. It is based on an actual project, but all identifying information has been removed. Although the numerical data is fictitious, it reflects the patterns observed in the real data. The document serves as a semi-formal report intended for a non-technical internal audience.

Project X’s Background

Project X was created to address the inequity gap in post-high school education between students with and without historically marginalized identities. Company A, where I currently work, was hired by a state funder to develop and implement this project as part of a state-wide initiative to increase the number of marginalized individuals in high-paying jobs.

Research suggested that a key factor contributing to this gap was public school staff’s mindset about students of color. Accountability measures had been tried in the past but did not reduce the discrepancy in enrollment rates. Company A took a different approach and developed customized data dashboards for participating schools, containing information drawn from student surveys in each school. This allowed staff to understand their students better, form stronger staff-student relationships, and provide more effective career and academic advising. The company also implemented a staff survey, which would inform staff-support programs.

The first round of Project X was implemented in 2020 with 10 participating high schools and was a successful pilot. In 2022, the project received larger funding to partner with 30+ schools across the state. As a research fellow on the impact and evaluation team, I evaluated the effectiveness of the pilot, fine-tuned survey instruments, and improved the data dashboard products before the project was implemented at a larger scale. The ultimate goal was to reduce the inequity gap and create a more diverse workforce, which is essential for creativity, innovation, better decision making, and a healthier organizational culture, all of which matter greatly to the state.

Sample Problems

Despite receiving verbal praise for Project X, C-level executives and senior members raised concerns regarding different aspects of the project based on our clients’ feedback.

Here are examples of the identified problems and how I addressed them:

  • Problem 1: Low student survey completion rate in some schools (as low as 25%)

  • Problem 2: Staff feeling uncomfortable during the survey and quitting half-way through

  • Problem 3: We needed quantifiable impact, beyond positive verbal feedback, to help us and our board members understand whether our resource allocation was appropriate

Problem 1: Completion Rates

For problem one, I suspected that factors such as survey length, school size, and student motivation were behind the low completion rates.

From the figure above, we can see that the completion rates vary between schools, with Schools 4 and 9 outperforming the rest.

The most obvious factor is survey length. Each school’s survey was customized, so the number of questions per survey differs by school.

We can see that the relationship between completion rate and survey length is negative: as the survey gets longer, the completion rate drops. To be exact, the correlation coefficient was -0.26, which indicates a weak-to-moderate negative correlation.

The relationship is not strong and, more importantly, it is not linear. The schools that received a 12-minute student survey outperform the schools that received a 10-minute survey. That means other factors contribute to the completion rate.

Take a look at the following figure:

We can see that the relationship between school size (the number of students in each participating school) and the completion rate is negative. This time, the correlation is strong, at -0.73: the bigger the school, the lower the completion rate. This gave me a clearer picture of what drives the completion rate.
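For readers curious how these coefficients can be computed, here is a minimal sketch in Python. The table and its column names (survey_minutes, n_students, completion_rate) are hypothetical stand-ins for the actual school-level data, and the values below are illustrative only.

```python
import pandas as pd

# Hypothetical school-level table: one row per school.
schools = pd.DataFrame({
    "school":          range(1, 11),
    "survey_minutes":  [10, 12, 15, 12, 14, 15, 11, 10, 12, 13],
    "n_students":      [400, 650, 900, 300, 800, 1100, 500, 350, 1000, 700],
    "completion_rate": [0.55, 0.48, 0.30, 0.80, 0.35, 0.25, 0.40, 0.60, 0.70, 0.45],
})

# Pearson correlation of completion rate with survey length and with school size.
print(schools["completion_rate"].corr(schools["survey_minutes"]))  # reported as ~ -0.26 in the pilot
print(schools["completion_rate"].corr(schools["n_students"]))      # reported as ~ -0.73 in the pilot
```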

However, the relationship is still non-linear. School 9 has 1,000 students, but its completion rate is much higher than that of many smaller schools.

In Project X, schools can choose to implement the student survey either by emailing it to students or by dedicating a portion of class time to completing it. It is clear that the schools with high completion rates are those that implement the survey during class.

It is important to note that school size correlates with staff’s ability to implement the survey during class: the bigger the school, the harder it becomes to arrange these time blocks. Even so, School 9 achieved a completion rate of 70% with over 1,000 students.

A linear regression analysis showed that schools that implement the survey during class time have an average completion rate 20% higher than schools that send the survey out via email.
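Here is a sketch of how such a regression can be set up, assuming the same hypothetical school-level table with an added in_class indicator (1 = administered during class time, 0 = sent by email) and controlling for school size; the coefficient on in_class is the average difference being described.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical school-level table; completion_rate is in percentage points.
schools = pd.DataFrame({
    "completion_rate": [55, 48, 30, 80, 35, 25, 40, 60, 70, 45],
    "in_class":        [1, 0, 0, 1, 0, 0, 1, 0, 1, 0],   # 1 = survey administered during class
    "n_students":      [400, 650, 900, 300, 800, 1100, 500, 350, 1000, 700],
})

# Completion rate regressed on administration mode, controlling for school size.
model = smf.ols("completion_rate ~ in_class + n_students", data=schools).fit()
print(model.params["in_class"])  # average completion-rate difference for in-class administration
```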

It was clear to me how important it is to implement the survey during class time; however, I noticed that School 7 had a low completion rate despite implementing the survey during class. At the same time, Schools 8 and 10 had higher completion rates despite implementing the survey via email.

With this information in mind, I conducted semi-structured interviews with staff from participating schools and found that Schools 4 and 9 took the time to explain the significance of the survey and why it was important to complete. They also explained what the school planned to do in response to the survey results. School 7 did not do this.

Schools 8 and 10 made sure that the significance of survey participation was stated clearly not only in the body of the email but also in its title. So, I realized that it was the act of communicating the importance of the survey to the students that mattered most.

This finding is consistent with the literature: people in an organization are less likely to participate in a survey if they do not know how it relates to them and how the organization will act on the results.

Based on analyses of the different factors that can influence survey completion rates, I propose these recommendations:

  1. Limit the survey length to 10-11 minutes by restructuring the survey to include only necessary questions, removing vague questions that take longer to answer, and ensuring that the survey design is clean and easy to navigate.
  2. Standardize the surveys so that each school receives the same survey. If that is not possible, keep customization to a minimum to avoid drastic differences between schools.
  3. Encourage all participating schools to implement surveys during class time, explain thoroughly why survey participation is important, and lay out the plans that the school has in response to the survey results.
  4. If the school does not have the capacity to implement the survey during class, make sure its significance is explained in the email’s title and body.

After these strategies were adopted, student survey completion rates improved drastically during the implementation in early 2023.

The figure below illustrates the completion rate in the 2022 - 2023 project implementation:

In this round of implementation, we can see that the completion rate is relatively high regardless of the number of students in each school. Company A was therefore able to implement these strategies successfully and improve the student survey completion rate.

Problem 2: Staff Feeling Uncomfortable

The staff were also required to fill out a 10-15 minute survey assessing their current knowledge and understanding of post-high school options and preparation. The aim was to help school leaders understand the discrepancy between staff competency and student aspirations and develop appropriate support strategies.

This problem deals with a sensitive issue, as staff had the option to stop taking the survey as soon as they felt uncomfortable (this option follows ethical guidelines for human-subjects research).

By examining the back-end of our survey software, I was able to track school staff’s response patterns. Although the completion rate for the staff survey was much higher than for the student survey, there was a noticeable pattern: 30% of staff stopped taking the survey entirely once they reached the questions about their knowledge. This means we were losing many staff members who could have provided insightful answers.

Our 2020 staff survey had the following structure:

Demographics -> Expectations -> Knowledge -> School Support
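As an illustration of how this drop-off point can be identified from the raw back-end data, here is a minimal sketch; the long-format layout, the respondent column, and the example rows are all hypothetical.

```python
import pandas as pd

# Hypothetical export from the survey back-end: one row per answered question.
responses = pd.DataFrame({
    "respondent": [1, 1, 1, 1, 2, 2, 3, 3, 3, 3],
    "section": ["Demographics", "Expectations", "Knowledge", "School Support",
                "Demographics", "Expectations",
                "Demographics", "Expectations", "Knowledge", "School Support"],
})

# Order the sections as they appeared in the 2020 survey.
section_order = ["Demographics", "Expectations", "Knowledge", "School Support"]
responses["section"] = pd.Categorical(responses["section"], categories=section_order, ordered=True)

# Last section each respondent reached before stopping.
last_section = responses.groupby("respondent")["section"].max()

# Share of staff who stopped right before the knowledge questions.
print((last_section == "Expectations").mean())
```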

However, even after a person stops taking a survey, the software records their responses to all of the previous questions, including the demographic ones. I suspected that the staff who stopped shared some characteristic, and I found a significant predictor: years of experience.

Using logistic regression, I found that the probability of continuing the survey, rather than stopping at the knowledge question bank, has a strong positive relationship with a staff member’s years of experience.
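A minimal sketch of such a model, assuming a staff-level table with a binary continued flag (1 = answered the knowledge questions, 0 = stopped beforehand); the column names and values are hypothetical.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical staff-level table.
staff = pd.DataFrame({
    "years_experience": [1, 2, 3, 4, 5, 7, 8, 10, 12, 15, 18, 20],
    "continued":        [0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1],  # 1 = continued past the knowledge questions
})

# Probability of continuing modeled as a function of years of experience.
logit = smf.logit("continued ~ years_experience", data=staff).fit(disp=False)
print(logit.params["years_experience"])  # a positive coefficient: more experience, more likely to continue
```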

On the surface, this suggests that most of the staff who answered our knowledge questions were highly knowledgeable and that school leaders might not have much to do to support them. But this is misleading: staff with less experience were simply avoiding the knowledge questions.

I sat down with a couple of younger staff members and interviewed them informally. I found that younger staff really wanted to do well in their careers. They were afraid that their lack of knowledge about higher education options would hinder their advancement and put them at risk of being held accountable by school leaders. Although our surveys were completely anonymous, this fear stopped them from continuing the survey.

If schools only receive responses from knowledgeable staff, how can they plan support programs for the less knowledgeable ones? This was something I had to solve prior to the second round of Project X implementation.

I took the analysis further and examined the stopping pattern for white versus POC (people of color) staff.

It turns out that the strong relationship between years of experience and the probability of continuing the survey only holds for white staff, who make up roughly 70%-90% of all staff. POC staff tended to avoid answering the knowledge questions altogether, which could be due to a fear of reprimand by school leaders. This problem would leave schools with knowledge data from white staff only, and none from POC staff.
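One way to test whether the experience effect differs by group is to add an interaction term to the logistic model; this is a sketch under the same hypothetical setup, with a made-up 0/1 poc indicator.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical staff-level table with a 0/1 indicator for POC staff.
staff = pd.DataFrame({
    "years_experience": [1, 2, 4, 5, 8, 12, 15, 20, 2, 5, 9, 14, 18],
    "poc":              [0, 0, 0, 0, 0,  0,  0,  0, 1, 1, 1,  1,  1],
    "continued":        [0, 0, 1, 0, 1,  1,  1,  1, 0, 1, 0,  0,  1],
})

# The interaction term lets the experience effect differ between white and POC staff.
model = smf.logit("continued ~ years_experience * poc", data=staff).fit(disp=False)
print(model.params)  # years_experience: effect for white staff; years_experience:poc: shift for POC staff
```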

Based on my analysis, these are my recommendations:

  1. School leaders MUST communicate clearly to staff that responses to this survey are completely anonymous and that there is no way to trace a response back to an individual.

  2. School leaders must assure staff that their responses will in no way result in accountability measures. Our survey questions do not ask about serious mental health issues or serious harm to students, so no accountability measures should be taken against any staff member.

  3. We must restructure the survey so that it does not bring ANY amount of shame to our staff members. This includes moving the knowledge questions to the end of the survey and allowing staff to describe, in their own words, how they want to be supported by the school before reaching the knowledge questions. The new structure should be: Demographics -> School Support -> Expectations -> Knowledge.

  4. Re-phrase the knowledge questions so that they start with “Based on how your school support systems are… rate your knowledge in….” This allows staff to be more honest, because their knowledge is framed as a result of the school’s support systems rather than a personal shortcoming.

After implementing these strategies, the proportion of staff who quit taking the survey decreased drastically in the 2022 - 23 project cycle. Take a look at the figure below:

In each school, the maximum proportion of staff who quit in the middle of the survey was just under 20%. With an average quitting rate of 10%, this round of implementation showed a significant improvement over the pilot round, where the quitting rate was 30%.

I also found that there was no relationship between years of experience and the likelihood of continuing versus quitting the survey, and no relationship between being white or non-white and that choice. This means the staff who quit the survey did not share specific demographic characteristics that could bias our results; the dropouts appear to be essentially random.

Problem 3: Quantifiable Impact

The positive verbal feedback came from many participating staff and school leaders, as well as some students with whom our senior partners interacted. This feedback indicated an improvement in staff-student relationships within schools, which is encouraging: it suggests we were on the right track and had done something right.

However, the ultimate goal of Project X goes beyond improving staff-student relationships within schools. One of the most important intended impacts of this project is to decrease the discrepancy in post-high school education enrollment rates between white and non-white students. A secondary goal was to increase these enrollment rates overall.

Since the project was implemented in 2020 and the participating cohort included 11th- and 12th-grade students, I approached a state agency to obtain the 2021 and 2022 post-high school enrollment rates for these participating schools.

But we would never know the true impact of the project without a comparison group, so I obtained the same data for 10 non-participating schools with similar demographic and regional characteristics. This allows us to draw more credible causal conclusions. In statistics, this design is called a comparative interrupted time series analysis.
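One common way to operationalize this kind of comparison is a difference-in-differences style regression on a school-year panel; the sketch below is illustrative only (column names, layout, and values are hypothetical), not the exact model used here.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical school-year panel: the outcome is the white-POC gap in the
# 4-year college enrollment rate (percentage points) for each school and year.
panel = pd.DataFrame({
    "school":  [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "year":    [2020, 2021, 2022] * 4,
    "treated": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],   # 1 = participated in Project X
    "gap":     [12, 8, 5, 14, 9, 6, 13, 12, 13, 11, 12, 10],
})
panel["post"] = (panel["year"] >= 2021).astype(int)     # project effects expected from 2021 onward

# The coefficient on treated:post estimates how much more the gap narrowed
# in participating schools relative to the comparison schools.
model = smf.ols("gap ~ treated * post", data=panel).fit()
print(model.params["treated:post"])
```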

When we think about post-high school education, many of us automatically think of “4-year college”. This is reasonable because it is traditionally the most robust post-high school option that allows students to secure livable wages upon graduation.

Looking at the visual above, we can see that the 4-year college enrollment rate declined on average between 2020 and 2022. However, it is clear that the gap between white and POC students in participating schools narrowed in 2021 and 2022, while the gap in non-participating schools remained essentially unchanged. This suggests Project X succeeded in our primary objective of closing this gap.

But the secondary aim of this project was also to promote post-high school education, and we expected the enrollment rate to go up. However, this was not the case in the data.

There are some potential explanations here. Firstly, the Covid-19 outbreak certainly impacted students’ decisions to attend or defer their first year of college. Secondly, and more importantly, I suspect that a 4-year college might not be the gold standard for livable wage credentials anymore.

In the United States, where higher education is incredibly expensive, many individuals have started to change their mindset about obtaining a 4-year college degree. Over the past decade, online learning, trade programs, apprenticeships, and associate degrees have allowed many people to land high-paying jobs. This could be because society now places less importance on credentials and more on demonstrated skills.

Therefore, I pulled enrollment data on other types of post-high school options from the same data source to create a new “enrollment rate” variable. This variable is a combination of 4-year college and all other non-traditional post-high school options.
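Constructing this combined variable is straightforward; here is a sketch assuming per-school counts of graduates by pathway (all column names and numbers are hypothetical).

```python
import pandas as pd

# Hypothetical per-school counts of graduates by post-high school pathway.
enroll = pd.DataFrame({
    "graduates":     [200, 250],
    "four_year":     [80, 70],
    "other_pathway": [30, 60],   # trade programs, apprenticeships, associate degrees, etc.
})

# Combined enrollment rate: any recognized post-high school pathway.
enroll["total_rate"] = (enroll["four_year"] + enroll["other_pathway"]) / enroll["graduates"]
print(enroll["total_rate"])
```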

Now we can see that the total enrollment rates actually went up for both white and POC students in the participating schools, and the rate gap is also smaller. This means that Project X succeeded in increasing the total post-high school enrollment rate, with the exception of the COVID year, 2021.

This achievement is considered a substantial increase in the field, with expectations of reaching 35%-45% state-wide by 2030. The quantitative analysis indicates that Project X made a difference, and the qualitative feedback supports our approach of nurturing staff-student relationships.

Despite some flaws in the survey process, the pilot round demonstrated substantial quantifiable impact in the schools. This could mean that staff’s willingness to build more nurturing relationships with their students depended on the project’s other products and activities to a greater degree than we expected. We originally thought that the student and staff surveys would be the most important aspects of this project, but the results suggest otherwise.

This raised the question of whether the time and budget spent on structuring the surveys should be greatly reduced for the new project cycle. The answer is no, not to that extent, because I found a correlation between the mean increase in the total enrollment rate and the student survey response rate, holding school size constant. The correlation is strong (0.8), but the magnitude of the associated increase is small. I concluded that the student survey is an important part of our project, but other aspects matter more than we expected.
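A common way to compute a correlation "holding school size constant" is a partial correlation via residualization; the sketch below illustrates the idea with hypothetical column names and values.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical school-level table.
schools = pd.DataFrame({
    "enrollment_increase": [3.0, 5.5, 2.0, 6.5, 4.0, 1.5, 5.0, 2.5],   # change in total enrollment rate (pp)
    "response_rate":       [0.55, 0.80, 0.45, 0.85, 0.65, 0.40, 0.75, 0.50],
    "n_students":          [900, 350, 1100, 300, 700, 1200, 450, 950],
})

# Partial correlation: regress both variables on school size, then correlate the residuals.
res_y = smf.ols("enrollment_increase ~ n_students", data=schools).fit().resid
res_x = smf.ols("response_rate ~ n_students", data=schools).fit().resid
print(res_y.corr(res_x))
```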

To make the 2022 - 2023 round of implementation even more impactful, I proposed that:

  1. We place less emphasis on 4-year college enrollment and encourage schools to further explore other post-high school options that still contribute to the public workforce (such as trade programs, associate degrees, apprenticeships, foreign exchange programs, etc.).
  2. Just as we put effort into constructing the surveys and dashboards (our products), we should put as much effort (if not more) into facilitating staff-student relationships (through listening sessions, coaching capacity, and in-person programming), as these relationships are key to the project’s intended impact.

We have yet to fully complete the 2022 - 2023 cycle, but we expect a greater magnitude of impact in this round, as our survey implementation was much more robust and our efforts to facilitate relationships within schools have increased.