Toward a More Equitable and Effective Process for Student-Mentor Cohort Assignment


Carah Katz, Martha Cervantes, and William Gray-Roncal
Johns Hopkins University Applied Physics Laboratory
{carah.katz, martha.cervantes, william.gray.roncal}@jhuapl.edu

Abstract – We host an internship program for trailblazing undergraduate students, providing intensive mentoring and the opportunity to participate in cutting-edge research. Students have the opportunity to build skills and demonstrate leadership for science and engineering careers. Many of our interns have a passion and capability for their projects, but lack the necessary expertise at the beginning of the program. Mentors are excited to support students and leverage their enthusiasm, but simultaneously must achieve their promised research results. Because this program is cohort-based, when considering the assignment between projects and student subgroups, the relationships and shared capabilities within a cohort are more important than the capability and match of a single student.

This paper presents possible solutions to enhance current matching algorithms and offers tunable ways to achieve satisfactory matches throughout the overall program. Current algorithms often do not consider factors beyond summarized, ranked choices. We propose a solution that considers more complex parameter spaces and interactions within groups. This requires an approach that considers different weights for attributes such as skill and interest, as well as parameters to adjust to the needs of a particular program session. This algorithm has the capability to be applied beyond our specific problem setting due to the flexibility and degrees of freedom provided; for example, such an algorithm could also be applied to K-12 enrichment programs focused on student interest and the exploration of new topics.

Index Terms – matching, diversity and inclusion, internship, industry partnerships

INTRODUCTION

The CIRCUIT Program seeks to match high potential undergraduate students in cohorts to mentors and their projects using an algorithmic approach. We initially used the Gale-Shapley matching algorithm because of its long history of success in similar problem settings; however, we found that our problem space differed in several core assumptions and we frequently had to craft a bespoke solution from raw algorithm outputs which differed from the optimal solution. This is time consuming, breaks optimality, and can often result in a poor solution for one or more matches because the complexity makes it challenging for humans to effectively explore options.

We briefly note a few challenges, including the importance of considering cohort dynamics and diversity; the potential for rankings containing incomplete preferences; the disconnects between student expertise and student enthusiasm; and the combination of skills and expertise required to successfully complete a project. Finally, we note that we typically match students post-admission, so in our problem context, it is important that each student receive an acceptable match, even if it results in an imperfect (although still acceptable) match for another student.

I. Gale-Shapley Matching Algorithm

The Gale-Shapley algorithm addresses perhaps the simplest and best-known form of the stable matching problem. This is often presented through an analogy of marriage stability, aptly named the Stable Marriage problem (SM): two sets of people, four bachelors (proposers, men) and four bachelorettes (proposees, women), each have their own preferences over the members of the other set. They must be matched to one another in such a way that all parties have been paired with their highest possible choice (i.e., no two people prefer each other over their current matches); we define this as a stable match. Not every individual will be paired to their top choice partner in most scenarios [1].

The algorithm does not favor men and women equally and has a proposer bias, which we define as follows: an algorithm’s preference for the proposer instead of the proposed-to; the resulting output favors the proposer and their best interests as opposed to offering both groups equal preference.

Gale-Shapley allows the men to propose to any of the women, regardless of whether they are currently matched or unmatched, but the women may only select from those who have already proposed to them. That is, the men have all options available to them, while the women hold only one tentative match at a time and can only choose among the proposals they receive, regardless of their own preferences [2]. Gale-Shapley’s algorithm has more applications than its basis in SM; other use cases include college admissions and hospital residency [3]. A noteworthy attribute of the Gale-Shapley algorithm is that it is widely agreed to be an efficient solution to the stable matching problem. Proposer bias is not inherently a bad thing, and can be used to the advantage of various coordinators and programs depending on the problem setting.
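As a minimal sketch of the proposal process described above (not the implementation used in CIRCUIT), the following Python function runs Gale-Shapley on complete, strict preference lists. The function name `gale_shapley` and the two-by-two example data are hypothetical, chosen only to make the proposer bias visible: the instance has two stable matchings, and each proposing side obtains the one it prefers.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Return {proposer: receiver}; assumes complete, strict lists of equal size."""
    # rank[r][p] = position of proposer p in receiver r's list (0 = most preferred)
    rank = {r: {p: k for k, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_idx = {p: 0 for p in proposer_prefs}  # next receiver each proposer will try
    engaged_to = {}                            # receiver -> proposer tentatively held
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        r = proposer_prefs[p][next_idx[p]]
        next_idx[p] += 1
        current = engaged_to.get(r)
        if current is None:
            engaged_to[r] = p                  # receiver was unmatched: accept
        elif rank[r][p] < rank[r][current]:
            engaged_to[r] = p                  # receiver trades up
            free.append(current)
        else:
            free.append(p)                     # proposal rejected
    return {p: r for r, p in engaged_to.items()}


# Hypothetical instance with two stable matchings.
men = {"m1": ["w1", "w2"], "m2": ["w2", "w1"]}
women = {"w1": ["m2", "m1"], "w2": ["m1", "m2"]}

men_propose = gale_shapley(men, women)    # every man gets his first choice
women_propose = gale_shapley(women, men)  # every woman gets her first choice
print(men_propose, women_propose)
```

Both outputs are stable, yet in each run the proposing side receives its first choices and the other side its second choices, which is the proposer bias defined above.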
II. Irving’s Matching Algorithm

Gale-Shapley is not the only existing matching solution. The second most commonly applied solution is Irving’s algorithm, an algorithmic solution to the matching problem deriving from Gale-Shapley’s original solution. The difference between the two algorithms [4] is seen in variables, use cases, and procedural steps. Gale-Shapley’s algorithm is best suited for cases where there are two groups to consider. Irving’s algorithm is best suited for cases where there are three groups to consider; examples include assigning students to mentors and projects (Student-Project Allocation (SPA)) and assigning dorming students to roommates and dorm buildings (Stable Roommates Problem (SR)) [3,4]. While Gale-Shapley’s algorithm matches individuals across two different sets, Irving’s algorithm matches individuals of the same set [5].

Through this use of single set matching as opposed to two-set matching, all individuals have all potential options from their preference list at their disposal to propose to, effectively eliminating the proposer bias inherent to Gale-Shapley’s original solution.

METHODS

Gale-Shapley has been helpful thus far as an alternative to manually matching interns and projects in CIRCUIT. However, it is limited in its ability to incorporate soft and hard constraints and consider solutions that are likely to balance the goals and desires of both projects and students. As a result, the optimal solution can include pairings which cannot be implemented realistically; for example, this might include friction in relationships between interns or insufficient cohort skills to complete a project. Adjusting GS solutions manually to correct high-risk scenarios, such as those introduced by this lack of ability to implement hard or soft constraints, means that we still have not been able to fully automate the matching process despite the use of Gale-Shapley’s proven mathematically optimal solution.

Through testing the Gale-Shapley and Irving algorithms on synthetic datasets, we have been able to better understand the algorithms in the context of our specific problem statement. Gale-Shapley’s algorithm functions in a two-set matching game where the two sets explicitly have preferences of one another; Irving’s algorithm functions in a single set matching game where each individual has explicit preferences of one another. Our matching problem is such that the two sets (interns and projects) have complex preferences, and these may be further constrained by group interactions that are implicit in rankings, but not initially evident prior to matching. Because interns are already accepted into the program and projects require specific skillsets, our problem can be illustrated visually via the following model (Figure I). We show that interns have subset-skills (e.g., Python programming) and projects have subset-types (e.g., public health surveillance, spacecraft ground system development) but that interns have preferences not of specific projects but of specific project-types and projects have preferences not of specific interns but of specific intern-skills. This makes the connection between interns and projects implicit as opposed to explicit.

FIGURE I
PROBLEM SPACE AND PROPOSED SOLUTION THAT INCORPORATES MULTIPLE FACTORS AND GROUP DYNAMICS

Our proposed solution derives from an approach that considers the incorporation of descriptive lists [6], which we adapt to consider the potential to receive incomplete datasets as described in our setting. In the situation that an incomplete dataset is received, we propose, with inspiration from prior work [6], that missing data points can be predicted algorithmically based on the existing data. In our example, this would manifest as estimated intern preferences for unranked projects. By running the matching algorithm with a now completed dataset, we lessen the chance for interns to be output as unmatched [6]. To incorporate this information, interns are required to provide preferences not over specific projects but over project types (e.g., public health surveillance, spacecraft ground system development). Similarly, projects would provide preferences not over specific interns but over desired intern skills (e.g., Python programming).

With regard to cohort constraints, we recognize that it may be necessary for two interns to be prohibited from working in the same cohort (for example, if there were a conflict of interest). To avoid the need to reorder the output solution by hand, the algorithm must recognize that these pairings are considered unstable or prohibited in order to avoid impossible pairings in the output. Incorporating these constraints in practice may add complexity and in some cases may make a stable output impossible.

Additionally, we propose the potential implementation of soft constraints as a possible way to consider group dynamics when matching, not just thinking about each person as a single match, but thinking about the cohorts holistically, as an integrated team. Soft constraints would act as rules which say that while a pairing is not absolutely prohibited, it is also not a favorable pairing. For example, this could steer the algorithm away from a three-person cohort of three leader-type personalities, who may struggle as a result of their competing ideas, in favor of a three-person cohort that contains only one leader-type personality. Another example would be to avoid a scenario where a cohort has no one proficient in Python, when at least one is necessary for project success – despite each candidate being an acceptable match individually.

RESULTS

I. Individual Matching

In this section, we briefly illustrate the results of our approach using synthetic data consisting of interns, project preferences, projects, and intern preferences. The rules state that all interns must be matched to 1 project; all projects must be matched to 1 intern. The results are formatted as:

Type of algorithm (Proposing set)
Intern:Project
Intern’s preference of project / Project’s preference of intern

TABLE I
INTERNS’ RANKED PREFERENCES OF PROJECTS

Intern  1       2       3       4       5       6
A       Red     Blue    Yellow  Green   Orange  Purple
B       Green   Blue    Orange  Yellow  Red     Purple
C       Yellow  Green   Red     Purple  Blue    Orange
D       Green   Yellow  Blue    Red     Purple  Orange
E       Orange  Purple  Red     Green   Yellow  Blue
F       Orange  Blue    Green   Yellow  Purple  Red

TABLE II
PROJECTS’ RANKED PREFERENCES OF INTERNS

Project Pref  1  2  3  4  5  6
Red           E  F  C  D  B  A
Orange        A  C  E  F  D  B
Yellow        A  D  F  C  E  B
Green         C  A  F  E  D  B
Blue          D  F  C  E  A  B
Purple        A  F  D  B  E  C

Individual Matching Game Results
GS (Interns = Proposers) – same solution
GS (Projects = Proposers) – same solution

A:Yellow  B:Red  C:Green  D:Blue  E:Orange  F:Purple
3/1       5/5    2/1      3/1     1/3       5/2

For this input, which has an equal number of individuals in each set (projects, interns), the results are the same regardless of the algorithm applied or which set is chosen to act as the proposing set.

II. Cohort Matching

In this section, we briefly illustrate the results of our approach using random inputs to create synthetic data consisting of interns, project preferences, projects, and intern preferences. Our rules state that projects can take up to 2 interns and that each project must have at least 1 intern. The results are formatted as:

Type of algorithm (Proposing set)
Project:Intern1, Intern2 (if present)
Project’s preference of Intern1, of Intern2 / Intern1’s preference of project, Intern2’s preference of project

TABLE III
INTERNS’ RANKED PREFERENCES OF PROJECTS

Intern Pref  1       2       3       4
A            Red     Orange  Yellow  Green
B            Green   Red     Orange  Yellow
C            Yellow  Green   Red     Orange
D            Green   Yellow  Orange  Red
E            Orange  Yellow  Red     Green
F            Orange  Red     Green   Yellow

TABLE IV
PROJECTS’ RANKED PREFERENCES OF INTERNS

Project Pref  1  2  3  4  5  6
Red           E  F  C  D  B  A
Orange        A  C  E  F  D  B
Yellow        A  D  F  C  E  B
Green         C  A  F  E  D  B

Cohort Matching Game Results
GS (Interns = Proposers)

Red: A    Orange: E, F    Yellow: C    Green: D, B
6/1       3,4 / 1,1       4/1          5,6 / 1,1
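The intern-proposing results reported in this section can be reproduced with a short sketch of deferred acceptance in which each project holds up to `capacity` interns; with `capacity=1` it reduces to the one-to-one game. This is an illustrative reconstruction, not the code used to produce the reported results, and the function and variable names (`deferred_acceptance`, `rank_pairs`, `table_1` through `table_4`) are assumptions. The preference lists are copied from Tables I through IV, and complete, strict lists are assumed.

```python
from collections import deque


def deferred_acceptance(intern_prefs, project_prefs, capacity=1):
    """Intern-proposing deferred acceptance; returns {project: [interns]}.

    Assumes complete, strict lists: every project ranks every intern.
    """
    rank = {p: {i: r for r, i in enumerate(lst)} for p, lst in project_prefs.items()}
    next_choice = {i: 0 for i in intern_prefs}  # next project each intern will try
    held = {p: [] for p in project_prefs}       # tentative cohort per project
    free = deque(intern_prefs)
    while free:
        intern = free.popleft()
        if next_choice[intern] == len(intern_prefs[intern]):
            continue  # every listed project rejected this intern
        project = intern_prefs[intern][next_choice[intern]]
        next_choice[intern] += 1
        held[project].append(intern)
        held[project].sort(key=rank[project].get)  # most preferred first
        if len(held[project]) > capacity:
            free.append(held[project].pop())       # reject the least preferred
    return held


def rank_pairs(matching, intern_prefs, project_prefs):
    """1-based (project's rank of intern, intern's rank of project) per match."""
    return {p: [(project_prefs[p].index(i) + 1, intern_prefs[i].index(p) + 1)
                for i in interns]
            for p, interns in matching.items()}


table_1 = {  # interns' preferences, individual game
    "A": "Red Blue Yellow Green Orange Purple".split(),
    "B": "Green Blue Orange Yellow Red Purple".split(),
    "C": "Yellow Green Red Purple Blue Orange".split(),
    "D": "Green Yellow Blue Red Purple Orange".split(),
    "E": "Orange Purple Red Green Yellow Blue".split(),
    "F": "Orange Blue Green Yellow Purple Red".split(),
}
table_2 = {  # projects' preferences, individual game
    "Red": list("EFCDBA"), "Orange": list("ACEFDB"), "Yellow": list("ADFCEB"),
    "Green": list("CAFEDB"), "Blue": list("DFCEAB"), "Purple": list("AFDBEC"),
}
table_3 = {  # interns' preferences, cohort game
    "A": "Red Orange Yellow Green".split(),
    "B": "Green Red Orange Yellow".split(),
    "C": "Yellow Green Red Orange".split(),
    "D": "Green Yellow Orange Red".split(),
    "E": "Orange Yellow Red Green".split(),
    "F": "Orange Red Green Yellow".split(),
}
table_4 = {  # projects' preferences, cohort game
    "Red": list("EFCDBA"), "Orange": list("ACEFDB"),
    "Yellow": list("ADFCEB"), "Green": list("CAFEDB"),
}

individual = deferred_acceptance(table_1, table_2, capacity=1)
cohort = deferred_acceptance(table_3, table_4, capacity=2)
print(individual)
print(cohort, rank_pairs(cohort, table_3, table_4))
```

Under these assumptions the sketch returns the intern-proposing assignments listed in this section for both games. It does not enforce the minimum of one intern per project; that condition simply happens to hold for this input.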


GS (Projects = Proposers)

Red: F, B    Orange: A, E    Yellow: C    Green: D
2,5 / 2,2    1,3 / 2,1       4/1          5/1

In this simulation, we show that interns receive their first-choice project (which will not always be true, especially under complex constraints and preferences) when they are the proposer. As we incorporate more complicated constraints, we have a framework to develop and test solutions for a variety of use cases.

DISCUSSION

We have considered various matching techniques that allow student cohort to mentor group matching to consider many factors beyond just student and mentor individual preferences. We envision adapting this approach to explicitly consider skill and enthusiasm, as well as other attributes that may be desirable for a particular project or team. We note that many mentors are willing to host an enthusiastic, but junior student, as long as they are surrounded by peers with more experience; we hope to implement this approach in a way that honors hard and soft constraints and encourages the behaviors we aspire to in an automated approach.

Coding and validating this approach and considering the qualitative and quantitative tradeoffs, especially on real data, are important areas of future work. We note that this approach is unlikely to create stable solutions as Gale-Shapley does, and so users should consider this as a method to find satisfactory solutions, with the additional benefits of a tunable framework.

ACKNOWLEDGMENT

We would like to acknowledge the CIRCUIT and ASPIRE programs at the Johns Hopkins University Applied Physics Laboratory, the JHU/APL Sabbatical Program, and Ms. Beth Dungey at Long Reach High School, Columbia, Maryland.

REFERENCES

[1] Gale, D., & Shapley, L. S. (1962). College admissions and the stability of marriage. The American Mathematical Monthly, 69(1), 9-15. https://apps.dtic.mil/sti/pdfs/AD0251958.pdf

[2] Leios Labs. (2018, February 12). The stable marriage problem (Gale-Shapley algorithm) [Video]. YouTube. https://www.youtube.com/watch?v=A7xRZQAQU8s

[3] The Matching library developers. (2020). Matching library (Version 1.4) [Computer software]. doi: 10.5281/zenodo.4244776

[4] Abraham, D. J., Irving, R. W., & Manlove, D. F. (2007). Two algorithms for the student-project allocation problem (Technical Report No. 1). Journal of Discrete Algorithms. http://dx.doi.org/10.1016/j.jda.2006.03.006

[5] Yeh, J., & Dillion. (2014, December 4). Irving's algorithm and stable roommates problem [Video]. YouTube. https://www.youtube.com/watch?v=5QLxAp8mRKo

[6] Morfe, K. C., Agustin, U., Blanco, M. C., Cortez, D. M., & Alipio, A. (2022). Enhancement of Irving's algorithm with autocomplete feature. International Journal of Research Publication, 99(1), 77-86. https://doi.org/10.47119/IJRP100981420223061

AUTHOR INFORMATION

Carah Katz, high school senior at Long Reach High School, Columbia, Maryland, and Intern at the Johns Hopkins University Applied Physics Laboratory. Aspiring applied math and cognitive science student.

Martha Cervantes, Mechanical Engineer and CIRCUIT Project Manager at the Johns Hopkins University Applied Physics Laboratory.

William Gray-Roncal, PhD, Electrical Engineer, CIRCUIT Program Lead, and Group Supervisor at the Johns Hopkins University Applied Physics Laboratory.
