Improving Efficiency of Project Matching for educators and students

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.

Meet the Team

James Parkington

Matching Algorithm Engineer, Web Role Security Engineer, Portfolio Developer

Rory Nicholas

Spokesperson, Portfolio Developer, Front-end Developer

Teodora Lovin

Team Lead, System and Database Architect, Client Liaison

Project Abstract

Universities around the world have the annual task of allocating each of their final-year students a supervisor. Many of these universities, including University College London 'UCL', still sift through large spreadsheets manually looking for matches between students and supervisors. UCL also did not have a standardized way to facilitate communication between prospective supervisors and students.

The team of students at UCL partnered with Microsoft to showcase how a problem like this one could be elegantly solved using their Dynamics 365 products.

Technologies

Our project is powered by the following Microsoft technologies, allowing us to create a tailored solution to the problem we were presented with while integrating smoothly with existing UCL infrastructure.

Dataverse

Dataverse lets you securely store and manage data that's used by business applications. Standard and custom tables within Dataverse provide a secure and cloud-based storage option for your data.

JavaScript

JavaScript is used to develop interactive web applications. JavaScript can power features like interactive images, carousels, and forms.

Power Automate

Power Automate is a service that helps you create automated workflows between your favorite apps and services to synchronize files, get notifications, collect data, and more.

Power Apps

Power Apps is a suite of apps, services, and connectors, as well as a data platform, that provides a rapid development environment to build custom apps for your business needs.

Demo Video

We've produced a short video outlining the background of our project, how we went about solving it, and a demonstration of some key features of the finished project. You'll find this video below; a higher quality version is available for viewing here.

final-8min from Rory Nicholas on Vimeo.

Project Background and Client Introduction

UCL’s Computer Science department has recently been moving several day-to-day activities into workflows as part of a business transformation. In partnership with Microsoft, several tools and systems have been requested across the teaching and learning domains.

The greatest burden faced by our client, Dean Mohamedally, is the Final Year Project (FYP) Supervisor Allocation process, which is currently done manually. This involves finding a supervisor who would be a good match for each of the 160 students in the class, which is time-consuming.

The Current Solution

The current solution is not fit for purpose and involves multiple stages of spreadsheets, conflicting requests, and lack of timelines.

The process should allow a first-round survey to all final year students, to state their degree level, and whether they intend to do research only, industry only, or don’t know as their profile option.

Visibility of projects is a key ask for the academics - this includes research projects listing, ACM classification keywords of the academics, and visibility of the industry projects lists once they are ready. A solution that enables staff workloads to be input is needed at this point as those academics that have capacity to supervise and are not taken by students for their research projects, should then automatically make themselves available for projects with industry.

The second-round survey is the matching of the students to the research projects. This needs timeframes to be set up that mark key dates: when projects are advertised, when the projects are taken by a student, or if a student is in discussions with an academic for a project. Only once this is facilitated, it will be then signed off as allocated as FYP for Research Academic Projects.

After that stage, or in parallel, for students that have said yes to wanting industry projects on the UCL IXN, they will fill in a separate second round survey, called the UCL Motivation Tracking Survey. This data then profiles capabilities and allows a Teaching and Learning member of the team to import the student grades for the IXN and Strategic Alliance Teams to begin sourcing appropriate projects in those quantities.

By March this data is completed and those who said they did not know will be once again advertised to come forward to choose in a 3rd cycle of survey forms.

Academics need a listing that enables them to see, at a glance, who they are supervising as final year projects are allocated.

Project Goals

Easy-to-use interface for staff, students and administrators
Bring all steps of the FYP allocation process to one place
Store all relevant data in one location
Facilitate unsupervised matching
Automate supervised matching

Requirement Gathering

At the start of our project, we interviewed 4 students, 2 academics, and the administrator who is overseeing the FYP supervisor allocations this year.

We decided to use semi-structured interviews as the method of gathering information from would-be users. We probed topics such as what they would like to be able to do with a new system and what they thought of the current system. We deemed this more appropriate than a questionnaire, as people tend to be less thorough and descriptive in questionnaires. We also felt that open interview questions were the best way to really understand what the users want.

Below are notable points from answers to the questions we asked.

Students

When you went through the process of finding a supervisor for your Final Year Project, what was your impression of the process?

The process struck me as unorganized. I had trouble finding the place to look for the list of supervisors – it turned out it was on the systems engineering Moodle page.
I had a difficult time reading through the list of available supervisor options. A code was used to list their ACM keywords, so I had to look up the code every time. I wish it was easier to browse the list of supervisors. Also, my device had a glitch with the file that listed all the ACM keywords, which made browsing more difficult.
The process itself was okay, though I would’ve preferred a separate Moodle page for it and an email that would notify everyone simply about the deadlines.

What do you think about the process as described by the diagram below?

Would you prefer it if there was a website through which every (or most) of these steps would happen?

It would be great to have everything happen through one medium. I managed it this year, but it would’ve obviously been easier and cleaner if everything went through a single medium, rather than many strewn forms. I suspect the process would’ve been more on-schedule as well if the administrators didn’t have to manage all these forms and mediums of communication.
If possible, that sounds great! But it should certainly be linked with UCL so I don’t have to remember more credentials. I also want it to be easily memorable and accessible so I don’t keep forgetting what the website’s called or where to find the link.

Supervisors

What do you think about the current method of FYP collection?

On average, I have had success with it. I usually fill most of my spots during unsupervised allocation, which is nice because I get to pick who to supervise.
It’s alright, this year they sent out a new, well-formatted form for us to register as supervisors. Despite the fact it was a bit long, I don’t have trouble with it. I have found some great students to supervise this way. The only issues I've ever had were with students who were allocated to me by the administrator, because sometimes we had little in common and I felt I couldn’t provide the best advice on their dissertations.

What is most important to you to convey to potential students?

I want to convey my research groups and link students to my previous works. In previous years we were asked to provide our ACM keywords, and I think that’s also a good metric. Some think it’s outdated but I think in this case, it is fit-for-purpose.
I like to leave a customized message because I usually have specific projects in mind and want students who are interested in them to contact me directly.
I want to convey I am only available to supervise industry projects because I am not a researcher.

Administrators

What have been the greatest pain points in administrating this process?

It can take a lot of time to read through students’ answers and the available staff and try to match them such that everyone is happy. There are a lot of dimensions to consider, so this can be quite a taxing task. There are also some students who don’t find a supervisor on their own and then don’t fill the form in, so I have to chase them down. Sometimes there are also students and academics who agreed to work together but didn’t tell us, and this causes confusion and a lot of back and forth. On top of all of this, I need to orchestrate the entire process, from preparing forms, to getting that data out in a readable format, to making sure students and staff are on board and well informed. It’s generally a lot of work.

What are you most looking forward to getting out of this project?

I am excited to not have to spend as much time organizing each step and doing everything by hand. It took a very long time to come up with a document for students' supervisor options. I had to create a list, too, for them to see the ACM keywords. It took a very long time. If the system can handle displaying the correct inputs and outputs of data for each staff and student, that would already cut down on a big task. The automatic matching is very useful too - I could free up time to spend it better dealing with outliers or special cases. I also think this could be replicated and used in many other situations and potentially by other institutions to solve similar problems.

What functionalities do you see as essential?

I want it to be possible for professors to agree with students to work together before doing any automatic matching. Please also make sure the administrator knows of all such pairs made. I want to use ACM keywords to implement the matches. Academics should be asked to select their ACM keywords, and I also want it to be possible for students to select some ACM keywords they want to be matched by. This is particularly for students interested in research. A page where the students can just type a link and see all the available supervisor options would be great.

Must-have
1. Academics can be manually and automatically allocated to students based on common interests.
2. Academics can report the research groups they are a part of and ACM keywords that their work falls under when registering to supervise.
3. Academics who have agreed on a project with a student can soft book them through the system.
4. Administrators of the process must be able to approve or reject matches made by the system and any pre-allocation soft bookings made by users.
Should-have
1. Academics hosting students annually can easily update and resubmit their previous application.
2. Students can view available supervisors through a searchable list or table.
Could-have
1. Filled applications are editable unless the administrator opts to make them read-only.
2. The solution monitors student interest over many years.
3. The solution automatically gathers supervisors' ACM keywords from published works.
Won't have
1. The solution will not concern the allocation of industry students to industry projects.

Implementation

Web Roles are very important to our solution as they dictate what any given user can and can't see or do on our portal. There are three roles available to a user: Student, Supervisor, and Administrator. Administrator roles should be assigned by the department using the Portal Management section of our solution. The other two roles, Student and Supervisor, are chosen by the user when they log in for the first time.

When a user first signs in on the portal's landing page, they are presented with this screen. After using the radio buttons and submission button to select their role, they are taken to their chosen role's respective home page. They will not have another opportunity to change this and must instead ask an administrator to change it for them.

After clicking the above button, the following code will run, triggering the flow that will assign the correct web role to the current user based on which radio button they selected.
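
The original listing is not reproduced in this republished version. As a rough sketch only, a request of this kind could look like the JavaScript below. The flow URL, the payload field names (`contactId`, `webRoleId`), and both function names are assumptions made for illustration, not the project's actual code.

```javascript
// Hypothetical sketch: names, URL, and payload shape are assumptions.

// Build the JSON body that the "Assign Web Role to Contact" flow would receive.
function buildRolePayload(contactId, webRoleId) {
  if (!contactId || !webRoleId) {
    throw new Error("Both a contact id and a web role id are required");
  }
  return { contactId: contactId, webRoleId: webRoleId };
}

// Send the payload to the flow's HTTP trigger.
// `flowUrl` is a placeholder; `fetchFn` defaults to the global fetch.
async function assignWebRole(flowUrl, contactId, webRoleId, fetchFn = fetch) {
  const response = await fetchFn(flowUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildRolePayload(contactId, webRoleId)),
  });
  if (!response.ok) {
    throw new Error("Role assignment failed with status " + response.status);
  }
  return response.status;
}
```

Keeping the payload construction separate from the network call makes the shape of the request easy to check without contacting the flow.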

After a user has selected their role, whenever they next sign in, they do not see this role selection screen and are instead directed to their home page, as mentioned before. This is the code that does so.

These roles are used to stop a user from seeing content that they should not be able to see. Similar to the above code, Liquid is used on each of our Portal's web pages to check what roles are assigned to the current user's contact. If they do not have sufficient roles, they are prompted to return to the sign-in page.

Flows are pieces of logic which allow some level of automation to otherwise static objects, such as tables. Flows can be triggered by several things, like the manual press of a button, the modification of a particular table, or the arrival of a certain time of day. They can be used to do things like edit data in a table or send an email. Below are some of the most important ways in which we used flows:

Add Contact Field to Supervisor ACM
This flow is triggered by the addition/modification/deletion of a record inside Supervisor Responses v2. When a Supervisor Application is submitted/modified/deleted, a record is created inside Supervisor-ACM and the ID, ACM Keyword, and Supervisor Response ID fields are populated in that record. This flow is used to populate the remaining Contact field of this new record. The flow does this by extracting the Contact field from the newly modified Supervisor Response record and inserting it into the Supervisor-ACM records belonging to that supervisor.
Create Student-ACM Record

This flow is triggered by the addition/modification/deletion of a record inside Student Responses. When a Student Application is submitted/modified/deleted, a record is created inside Student Responses. We also want to create records inside Student-ACM, populating the columns ACM Keyword and Contact. The First-choice Interest field is extracted from the submitted record, and this is mapped to ACM Keywords using the Topic-ACM Maps. A record is created for the student associating them with each of these ACM keywords.
Insert Topic-ACM Records
This flow is triggered manually, and was created for the purpose of efficiently populating the lookup table which maps student topics to ACM keywords. It works by extracting all ACM keywords pertaining to a certain section and creating a row in the lookup table mapping the keyword to a particular student topic. For example, it may select all ACM keywords beginning with '2.1', and create a record mapping them to 'Algorithms and CS Core Theory'.
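
The flow itself is built in Power Automate, but its mapping logic can be sketched in JavaScript. The record shapes (`id`, `code`) and the function name below are assumptions for illustration only.

```javascript
// Hypothetical sketch of the Topic-ACM mapping logic; shapes are assumptions.

// Return one lookup row per ACM keyword whose code falls under `sectionPrefix`.
function mapKeywordsToTopic(acmKeywords, sectionPrefix, topic) {
  return acmKeywords
    // Match the section itself ("2.1") or anything nested below it ("2.1.3"),
    // but not a sibling section that merely shares the characters ("2.10").
    .filter((k) => k.code === sectionPrefix || k.code.startsWith(sectionPrefix + "."))
    .map((k) => ({ topic: topic, acmKeywordId: k.id }));
}
```

Matching on the prefix followed by a dot is a design choice in this sketch; a plain "begins with" test would also pick up codes such as '2.10'.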
Assign Web Role to Contact
This flow is triggered when a user signs in on the Portal for the first time and selects a role. The flow is expecting a request with a JSON payload of a specific shape. This payload contains the id of the currently signed-in user, along with the id of the web role that they just selected. The flow then relates the record of the currently signed-in user in the Contact table with the record of the selected web role in the Web Roles Table. Doing this gives the permissions of that web role to the user.
Update Student Responses Availability
This is the longest and most complex flow in the solution and is triggered by the addition or modification of a record inside the Supervisor-Student Pairs table. When a pair record is added or edited in this table, one of two things may happen depending on the value of the 'Approved' field in the record in focus.

If this has a value of 'Rejected,' meaning the supervisor didn't approve the match, the student response of that match has its 'Availability' field set to 'Yes.' Then, if the match's supervisor has a remaining capacity less than their actual capacity, the supervisor's 'Availability' is set to 'Yes' as well.

If instead the value is 'Approved' and the match's supervisor has a remaining capacity equal to 0, meaning they cannot take on any more students, the supervisor's 'Availability' is set to 'No,' while the student's is set to 'Yes.' The pair's 'Approved' field is also set to 'Rejected.' If the supervisor's remaining capacity is not equal to 0, it is decreased by 1, and the student's 'Availability' is set to 'No.'
Post Match Data to Supervisor-Student Pairs Table
This flow is triggered when an HTTP POST request is sent to a specific API, with a payload containing JSON data in a particular format. When the API receives this POST request, the flow begins and extracts the JSON from the payload. This JSON data is the output of the matching algorithm. It contains a list of supervisors, each with a list of students that they matched with and the reason for the match (the common ACM keyword). These are each extracted by the flow and inserted into the 'Supervisor-Student Pairs' table. From the JSON data, the supervisorResponseIds and studentResponseIds are used to get the name of the user that submitted each response. This name is what is entered into the table, rather than the responseId itself. Each inserted pair is also given a default 'Approved' field value of 'Pending,' which can later be edited by the administrator.
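
To make the flattening step concrete, here is a minimal JavaScript sketch of what the flow does with the payload. The payload shape (`supervisorResponseId`, `matchedStudents`, `studentResponseId`, `keyword`) is an assumption, and the lookup from response IDs to user names is left out.

```javascript
// Hypothetical sketch: the payload shape is an assumption, and the flow's
// lookup of user names from response ids is omitted.

// Turn the nested algorithm output into one flat record per supervisor-student pair.
function toPairRecords(matchOutput) {
  const pairs = [];
  for (const supervisor of matchOutput) {
    for (const match of supervisor.matchedStudents) {
      pairs.push({
        supervisorResponseId: supervisor.supervisorResponseId,
        studentResponseId: match.studentResponseId,
        reason: match.keyword, // the common ACM keyword
        approved: "Pending",   // default until the administrator decides
      });
    }
  }
  return pairs;
}
```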
HTTP Get Flows
These 5 flows all work in exactly the same way and exist for the same purpose: to extract data from our database into the JavaScript code that powers the matching algorithm. Each flow is triggered when its API receives an HTTP GET request. This is a plain request with no payload. Each flow then sends back the appropriate data from the correct table. Not all fields are of use to the matching algorithm, so only a select few are sent back to the JavaScript in the HTTP response. The table whose data is pulled is named in each flow's title. These flows could be combined into one large flow that sends all the data in one response, but this would be a very large amount of data, making the response hard to read for anyone who needs to debug the code in the future. Separating them also makes the JavaScript code more orthogonal, as each function performs only one task.
Liquid
Liquid is a template language, written amongst HTML, that uses a combination of objects, tags, and filters inside template files to display dynamic content. Liquid, like any template language, creates a bridge between an HTML file and a data store; in our context, that is the data we use for matching.

Liquid can be learnt fairly quickly using the official documentation. Here are some examples where we used Liquid in our project:
Hiding Sensitive Data During Development
Displaying User-Specific Information
Iteration to Display Elements in an Array
Matching Algorithm
We had initially decided to use Project Operations for the supervised matching section of our solution. However, after closer inspection of how the matching works in Project Operations, we decided that we wouldn’t have enough control over how students were matched to supervisors. Instead, we created a custom algorithm using JavaScript.

Our matching process can be split into 5 main, distinct components.

Collection of Data
Since our code is written in pure JavaScript, the required data is pulled from our solution’s database using simple HTTP GET requests. There are 5 different requests that pull 5 separate sets of data. This data is initially collected using Power Automate instant cloud flows (as seen above). Each flow can be broken down into three steps.

First, as the image shows, the flow waits for a GET request to its specified API. Since the request doesn't need to send any data, the payload format is left empty.

Next, the flow performs a ‘list rows’ operation on the specified table to get all of the rows in JSON (JavaScript Object Notation) format. Only the columns containing data necessary for matching are included, to reduce the size of the payload sent back to our code. In this case, the columns 'acmclassificationcodesid', 'keyword', and 'id' are pulled from the 'ACM Classification Codes' table.

Finally, an HTTP response with status code 200 is sent back, including the JSON-formatted list of data. This procedure is repeated 5 times, once for each table that we require data from.

Cleaning the Received Data
When the data for each table is received by the code, it needs to be cleaned and put into a named JavaScript object, so that it can be easily accessed by the algorithm. Here is an example of the aforementioned GET requests. The data is initially fetched from the API of the flow pulling data from the desired table. The data is then parsed into a new JavaScript object with more human-readable attribute names. Data is also 'cleaned' where it needs to be. For example, in this function, we remove any records that do not have a specified 'responseId.'
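
The original function is not included in this republished version. A minimal sketch of the fetch-then-clean pattern described above might look like this; the raw column names (`raw_responseid`, `raw_keyword`), the flow URL, and the function names are placeholders, not the project's real schema.

```javascript
// Hypothetical sketch: raw column names and the flow URL are placeholders.

// Rename raw columns to readable attribute names and drop incomplete records.
function cleanSupervisorAcmRows(rawRows) {
  return rawRows
    .map((row) => ({
      responseId: row.raw_responseid,
      acmKeyword: row.raw_keyword,
    }))
    // Remove any record that was saved without a responseId.
    .filter(
      (record) =>
        record.responseId !== undefined &&
        record.responseId !== null &&
        record.responseId !== ""
    );
}

// Fetch the rows from the flow's API with a plain GET (no payload), then clean them.
async function getSupervisorAcmRecords(flowUrl, fetchFn = fetch) {
  const response = await fetchFn(flowUrl);
  if (!response.ok) {
    throw new Error("GET failed with status " + response.status);
  }
  const rawRows = await response.json();
  return cleanSupervisorAcmRows(rawRows);
}
```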

Parsing of the Data
The received student data and supervisor data are then parsed into a more usable format. This is done for multiple reasons. Firstly, when a student fills out a response form, they choose which rough topics they have interests in. Each one of these topics relates to hundreds of ACM keywords. Therefore the ACM keywords need to be collated into one student response object. Secondly, when a supervisor fills out a response form, a new Supervisor-ACM table record is generated for each ACM keyword they selected as relevant to them. To make matching simpler, these too are collated so that there is only one supervisor response object per unique supervisor, with many ACM keywords related to it.

The below example shows the supervisor response parsing function. It can be seen that when sorting through Supervisor-ACM table records, records whose responseId has already been seen have their ACM keywords appended to the already existing object, rather than creating a new one.
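
The original listing is not reproduced here. As a sketch of the collation described above, assuming each cleaned record has the hypothetical fields `responseId` and `acmKeyword`:

```javascript
// Hypothetical sketch: field and function names are assumptions.

// Collapse many Supervisor-ACM records into one object per supervisor response.
function parseSupervisorResponses(supervisorAcmRecords) {
  const byResponseId = new Map();
  for (const record of supervisorAcmRecords) {
    const existing = byResponseId.get(record.responseId);
    if (existing) {
      // responseId already seen: append the keyword to the existing object.
      existing.acmKeywords.push(record.acmKeyword);
    } else {
      byResponseId.set(record.responseId, {
        responseId: record.responseId,
        acmKeywords: [record.acmKeyword],
      });
    }
  }
  return Array.from(byResponseId.values());
}
```

A `Map` keyed by responseId avoids rescanning the result list for every record, and it preserves the order in which supervisors were first seen.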

Finding All Possible Matches
The next step is to find all possible student matches for each supervisor. These matches are found based on whether the supervisor and student share a common ACM keyword in their responses. If a student is found to match a supervisor, the student's response object is inserted into an array of 'matched students' for the supervisor. This entry also contains the ACM keyword that they matched on, and the priority that the student assigned to it. This process is repeated for all students for all supervisors.

Below is the actual code for finding matches. It simply loops through each parsed supervisor response object (formed in the previous step) and calls the findMatchingStudents function. The findMatchingStudents function then loops through each parsed student response object and checks whether the response contains an ACM keyword that is also present in the current supervisor response object. If so, the student object is added to the supervisor's list of matched students, as described above. The list of matches for that supervisor is returned. This is repeated for every supervisor.
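
The team's listing is not included in this republished version, so the following is only a sketch of the loop described above. The object shapes are assumptions: each supervisor has an `acmKeywords` array of strings, and each student has an `acmKeywords` array of `{ keyword, priority }` entries. Apart from `findMatchingStudents`, which the text names, the identifiers are hypothetical.

```javascript
// Hypothetical sketch: object shapes are assumptions, not the project's schema.

// Collect every student who shares at least one ACM keyword with the supervisor.
function findMatchingStudents(supervisor, students) {
  const supervisorKeywords = new Set(supervisor.acmKeywords);
  const matched = [];
  for (const student of students) {
    // Use the first shared keyword found in the student's list.
    const shared = student.acmKeywords.find((k) => supervisorKeywords.has(k.keyword));
    if (shared) {
      matched.push({ student: student, keyword: shared.keyword, priority: shared.priority });
    }
  }
  return matched;
}

// Repeat the search for every supervisor.
function findAllMatches(supervisors, students) {
  return supervisors.map((supervisor) => ({
    supervisor: supervisor,
    matchedStudents: findMatchingStudents(supervisor, students),
  }));
}
```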

Ordering
Then, for each supervisor, the matches are ordered based on the priority that the student assigned to the ACM keyword that they matched with the supervisor on.
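
A minimal sketch of this ordering step, assuming each match carries a numeric `priority` where a lower number means a stronger preference (the source does not state the direction, so this is an assumption):

```javascript
// Hypothetical sketch: assumes priority 1 is the student's strongest preference.

// Return a new array of matches sorted by ascending priority number.
function orderMatches(matchedStudents) {
  // Copy first so the caller's array is left untouched.
  return [...matchedStudents].sort((a, b) => a.priority - b.priority);
}
```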

Reviewing the Matches
The final stage of the code for the matching process is the match review. For each supervisor, the matches that were found are reviewed one by one. Using the total number of unique student responses and supervisor responses, a minimum number of matches per supervisor is calculated. Each supervisor then has that many of their student matches 'finalised' and inserted into a final matches array. This is why the student matches list was ordered for each supervisor in the previous step.

This section of code also creates a list of any supervisors that were left with no students matched to them at the end of this process. This should only ever happen because there were more supervisors than students, not because there were no students interested in what the supervisor was willing to supervise, although that is still theoretically possible.

In a similar way, this piece of code finds any students that couldn't be matched to any of the supervisors. Again, in practice this will probably never happen, since the topics that students choose as interests in their responses encompass a vast number of ACM keywords. However, since it is still technically possible, it is handled, as it would be useful to the administrator.
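
The review stage can be sketched as follows. Several details are assumptions rather than the team's method: the minimum is taken as the floor of students divided by supervisors (at least 1), a student already finalised with one supervisor is skipped for later ones, and "unmatched students" means students with no finalised match. All identifiers are hypothetical.

```javascript
// Hypothetical sketch: the minimum formula and skip rule are assumptions.

function reviewMatches(supervisorMatches, students) {
  const supervisorCount = supervisorMatches.length;
  const minPerSupervisor =
    supervisorCount === 0 ? 0 : Math.max(1, Math.floor(students.length / supervisorCount));

  const assigned = new Set(); // student responseIds already finalised
  const finalMatches = [];
  const unmatchedSupervisors = [];

  for (const entry of supervisorMatches) {
    let taken = 0;
    // matchedStudents is expected to be ordered by priority already.
    for (const match of entry.matchedStudents) {
      if (taken >= minPerSupervisor) break;
      const studentId = match.student.responseId;
      if (assigned.has(studentId)) continue;
      assigned.add(studentId);
      finalMatches.push({
        supervisorResponseId: entry.supervisor.responseId,
        studentResponseId: studentId,
        keyword: match.keyword,
      });
      taken += 1;
    }
    if (taken === 0) unmatchedSupervisors.push(entry.supervisor.responseId);
  }

  const unmatchedStudents = students
    .map((s) => s.responseId)
    .filter((id) => !assigned.has(id));

  return { finalMatches, unmatchedSupervisors, unmatchedStudents };
}
```

Because the administrator can rerun the algorithm after approving or rejecting pairs, students left over in one pass can be picked up in a later one.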

Sending the Matches Back to Our Solution
After all of the matches between students and supervisors have been reviewed, and matchless supervisors and students have been found, the data is properly formatted and sent back to our solution's database using an HTTP POST request.

There is another flow in our solution waiting for a POST request to a specific API, expecting a payload containing specifically shaped JSON.
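
On the JavaScript side, the send step could be sketched like this. The payload keys (`matches`, `unmatchedSupervisors`, `unmatchedStudents`), the flow URL, and the function names are assumptions; the real flow expects whatever shape its trigger was configured with.

```javascript
// Hypothetical sketch: payload keys and the flow URL are assumptions.

// Serialise the algorithm's results into a single JSON string.
function buildMatchPayload(finalMatches, unmatchedSupervisors, unmatchedStudents) {
  return JSON.stringify({
    matches: finalMatches,
    unmatchedSupervisors: unmatchedSupervisors,
    unmatchedStudents: unmatchedStudents,
  });
}

// POST the serialised results to the waiting flow.
async function postMatches(flowUrl, payload, fetchFn = fetch) {
  const response = await fetchFn(flowUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: payload,
  });
  if (!response.ok) {
    throw new Error("POST failed with status " + response.status);
  }
  return response.status;
}
```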

The flow then iterates through the payload to extract all 'pairs' between a supervisor and student. From each pair, it takes the studentResponseId and supervisorResponseId associated with the pair. It uses these IDs to get the contacts of the people that submitted the responses.

The rest of the data in the pair is then extracted and a new 'Supervisor-Student Pairs' table record is generated.

As described in the above section, the results from the matching algorithm are inserted into the 'Supervisor-Student Pairs' table, which is viewable by the administrator. Here the matches can be reviewed and then approved or rejected. When a pair record is edited as shown below, a Power Automate instant flow is run, which performs the following logic.

If a pair is rejected by an administrator, the involved student's availability is set to 'Yes' and if the supervisor's remaining capacity is less than their total capacity, the supervisor's availability is also set to 'Yes.'

If a pair is instead accepted by an administrator and the supervisor's remaining capacity is equal to 0, the student's availability is set to 'Yes,' the supervisor's availability is set to 'No,' and the pair's status is set to 'Rejected.' Otherwise, the student's availability is set to 'No' and the supervisor's remaining capacity is reduced by 1.
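
The real logic lives in a Power Automate flow, but the two branches can be expressed as a small JavaScript sketch. The field names (`approved`, `availability`, `remainingCapacity`, `totalCapacity`) and the function name are assumptions for illustration.

```javascript
// Hypothetical sketch of the flow's branching; field names are assumptions.

// Return updated copies of the pair, student, and supervisor records.
function applyPairDecision(pair, student, supervisor) {
  const p = { ...pair };
  const st = { ...student };
  const su = { ...supervisor };

  if (p.approved === "Rejected") {
    st.availability = "Yes";
    if (su.remainingCapacity < su.totalCapacity) {
      su.availability = "Yes";
    }
  } else if (p.approved === "Approved") {
    if (su.remainingCapacity === 0) {
      // No room left: undo the approval and free the student again.
      st.availability = "Yes";
      su.availability = "No";
      p.approved = "Rejected";
    } else {
      su.remainingCapacity -= 1;
      st.availability = "No";
    }
  }
  return { pair: p, student: st, supervisor: su };
}
```

A pair still marked 'Pending' passes through unchanged, which matches the default status given to newly inserted pairs.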

The administrator should review matches and rerun the algorithm until they are completely satisfied with the outcome.
Table Relationships
Below is a diagram showing the relationship of all tables listed in the Table Descriptions section:

1:N, N:1, and N:N Relationships
In Power Apps, an N:1 relationship is called a Lookup. A 1:N relationship is simply a reversed Lookup (a lookup from the other table into this one). Creating either of these will create a column in the table that holds the lookup, and you must give this column a name. N:N relationships are not tied to a column in the same way, and no column needs to be created to set one up. Instead, a “hidden” table is created to enable the N:N relationship. This table cannot be accessed through Dynamics; it can only be queried using FetchXML in XrmToolBox. If you require the data that would be stored in this table, please reference the Issues, Workarounds, and Ideal Solution section. Detailed instructions on creating relationships can be found here.

Subgrids
Subgrids can be used to relate records in an N:N relationship. Two subgrids are used on the Supervisor Application (advanced) form to receive input about the user’s Research Groups and ACM Keywords. Subgrids can be used on advanced forms, but not on the first tab if the form has multiple steps. Instructions on adding and configuring a subgrid can be found here.

Outcomes
This was an extremely interesting project for the students and gave the team exposure to a wide range of new technologies and services.

The team built two web apps, one for the students and supervisors to interact with, and one with higher privileges for the administrator to oversee the process. They brought every stage of the project allocation process to one site: academics register to supervise students, students browse through these options to reach out to supervisors, students who haven't found a supervisor apply to be automatically allocated one, and matches are made based on common interests. Whereas these stages used to happen through different mediums, everything has been brought to a single portal.

They maintained close correspondence with the lead of our own Final Year Project module to get feedback on our progress and update our course of action. As someone who would be using our solution, his positive feedback and approval of our solution was convincing confirmation that we had produced a good outcome.

FYP Portal

Model Driven App

It was great to see how the team approached the project, documented their outcomes, insights and experiences, and provided engineering feedback to Microsoft on their experiences of using the various tools and services.

More details on the project
Team 42 - Modern Workplace Automation (james-parky.github.io)
