Improving Efficiency of Project Matching for educators and students

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.

Meet the Team

LeeStott_7-1668076534127.png

James Parkington

Matching Algorithm Engineer, Web Role Security Engineer, Portfolio Developer

LeeStott_8-1668076533990.jpeg

Rory Nicholas

Spokesperson, Portfolio Developer, Front-end Developer

LeeStott_9-1668076534064.png

Teodora Lovin

Team Lead, System and Database Architect, Client Liaison


Project Abstract

Universities around the world have the annual task of allocating each of their final-year students a supervisor. Many of these universities, including University College London (UCL), still sift through large spreadsheets manually, looking for matches between students and supervisors. UCL also did not have a standardized way to facilitate communication between prospective supervisors and students.

A team of students at UCL partnered with Microsoft to showcase how a problem like this one could be elegantly solved using Microsoft's Dynamics 365 products.

Technologies

Our project is powered by the following Microsoft technologies, allowing us to create a tailored solution to the problem we were presented with while integrating smoothly with existing UCL infrastructure.

 

LeeStott_30-1668077233574.png

 

 


Dataverse

Dataverse lets you securely store and manage data that's used by business applications. Standard and custom tables within Dataverse provide a secure and cloud-based storage option for your data.

 


JavaScript

JavaScript is used to develop interactive web applications. JavaScript can power features like interactive images, carousels, and forms.

 


Power Automate

Power Automate is a service that helps you create automated workflows between your favorite apps and services to synchronize files, get notifications, collect data, and more.

 


Power Apps

Power Apps is a suite of apps, services, and connectors, as well as a data platform, that provides a rapid development environment to build custom apps for your business needs.

 


Demo Video

We've produced a short video outlining the background of our project, how we went about solving it, and a demonstration of some key features of the finished project. You'll find this video below; a higher-quality version is available for viewing here.

final-8min from Rory Nicholas on Vimeo.

 

  • Must-have
    1. Academics can be manually and automatically allocated to students based on common interests.
    2. Academics can report the research groups they are a part of and ACM keywords that their work falls under when registering to supervise.
    3. Academics who have agreed on a project with a student can soft book them through the system.
    4. Administrators of the process must be able to approve or reject matches made by the system and any pre-allocation soft bookings made by users.

  • Should-have
    1. Academics hosting students annually can easily update and resubmit their previous application.
    2. Students can view available supervisors through a searchable list or table.

  • Could-have
    1. Filled applications are editable unless the administrator opts to make them read-only.
    2. The solution monitors student interest over many years.
    3. The solution automatically gathers supervisors' ACM keywords from published works.

  • Won't have
    1. The solution will not concern the allocation of industry students to industry projects.
    2. The solution will not provide a communication channel through which students can directly contact supervisors or vice versa.


Implementation

Web Roles are very important to our solution as they dictate what any given user can and can't see or do on our portal. There are three roles available to a user: Student, Supervisor, and Administrator. Administrator roles should be assigned by the department using the Portal Management section of our solution. The other two roles, Student and Supervisor, are chosen by the user when they log in for the first time.

 

When a user first signs in on the portal's landing page, they are presented with this screen. After using the radio buttons and submission button to select their role, they are taken to their chosen role's respective home page. They will not have another opportunity to change this and must instead ask an administrator to change it for them.

 

LeeStott_11-1668077025824.png

 

After clicking the above button, the following code will run, triggering the flow that will assign the correct web role to the current user based on which radio button they selected.

LeeStott_12-1668077025829.png
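The role-selection request can be sketched roughly as follows. This is a minimal illustration, not the team's code: the flow URL is a placeholder, and the payload field names `contactId` and `webRoleId` are assumptions chosen to mirror the "id of the signed-in user plus id of the selected web role" shape described for the flow.

```javascript
// Hypothetical sketch: the URL and the payload field names are placeholders,
// not the values used in the team's solution.
const ASSIGN_ROLE_FLOW_URL = "https://example.invalid/flows/assign-web-role";

// Build the JSON payload: who is signed in and which role they selected.
function buildRolePayload(contactId, webRoleId) {
  if (!contactId || !webRoleId) {
    throw new Error("Both a contact id and a web role id are required");
  }
  return { contactId, webRoleId };
}

// POST the payload to the flow's HTTP trigger. `fetchImpl` is injectable so
// the function can be exercised without a network connection.
async function assignWebRole(contactId, webRoleId, fetchImpl = fetch) {
  const response = await fetchImpl(ASSIGN_ROLE_FLOW_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildRolePayload(contactId, webRoleId)),
  });
  if (!response.ok) {
    throw new Error(`Role assignment failed with status ${response.status}`);
  }
  return response.status;
}
```

In a page script, a submit handler would read the checked radio button and call `assignWebRole` with the current contact's id and the chosen role's id.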

 


After a user has selected their role, whenever they next sign in, they do not see this role selection screen and are instead directed to their home page, as mentioned before. This is the code that does so.

LeeStott_13-1668077025841.png

 


These roles are used to stop a user from seeing content that they should not be able to see. Similar to the above code, Liquid is used on each of our Portal's web pages to check which roles are assigned to the current user's contact. If they do not have sufficient roles, they are prompted to return to the sign-in page.


Flows are pieces of logic which allow some level of automation to otherwise static objects, such as tables. Flows can be triggered by several things, like the manual press of a button, the modification of a particular table, or the arrival of a certain time of day. They can be used to do things like edit data in a table or send an email. Below are some of the most important ways in which we used flows:

  1. Add Contact Field to Supervisor ACM

    This flow is triggered by the addition/modification/deletion of a record inside Supervisor Responses v2. When a Supervisor Application is submitted/modified/deleted, a record is created inside Supervisor-ACM and the ID, ACM Keyword, and Supervisor Response ID fields are populated in that record. This flow is used to populate the remaining Contact field of this new record. The flow does this by extracting the Contact field from the newly modified Supervisor Response record and inserting it into the Supervisor-ACM records belonging to that supervisor.

  2. Create Student-ACM Record

    This flow is triggered by the addition/modification/deletion of a record inside Student Responses. When a Student Application is submitted/modified/deleted, a record is created inside Student Responses. We also want to create records inside Student-ACM, populating the columns ACM Keyword and Contact. The First-choice Interest field is extracted from the submitted record, and this is mapped to ACM Keywords using the Topic-ACM Maps. A record is created for the student associating them with each of these ACM keywords.

  3. Insert Topic-ACM Records

    This flow is triggered manually, and was created for the purpose of efficiently populating the lookup table which maps student topics to ACM keywords. It works by extracting all ACM keywords pertaining to a certain section and creating a row in the lookup table mapping the keyword to a particular student topic. For example, it may select all ACM keywords beginning with '2.1', and create a record mapping them to 'Algorithms and CS Core Theory'.

  4. Assign Web Role to Contact

    This flow is triggered when a user signs in on the Portal for the first time and selects a role. The flow is expecting a request with a JSON payload of a specific shape. This payload contains the id of the currently signed-in user, along with the id of the web role that they just selected. The flow then relates the record of the currently signed-in user in the Contact table with the record of the selected web role in the Web Roles Table. Doing this gives the permissions of that web role to the user.

  5. Update Student Responses Availability

    This is the longest and most complex flow in the solution and is triggered by the addition/modification of a record inside the Supervisor-Student Pairs table. When a pair record is added or edited in this table, one of two things may happen depending on the value of the 'Approved' field in the record in focus.

    If this has a value of 'Rejected,' meaning the supervisor didn't approve the match, the student response of that match has its 'Availability' field set to 'Yes.' Then, if the match's supervisor has a remaining capacity less than their total capacity, the supervisor's 'Availability' is set to 'Yes' as well.

    If instead the value is 'Approved,' and the match's supervisor has a remaining capacity equal to 0, meaning they cannot take on any more students, the supervisor's 'Availability' is set to 'No,' while the student's is set to 'Yes.' The pair's 'Approved' field is also set to 'Rejected.' If instead the supervisor's remaining capacity is not equal to 0, it is decreased by 1, and the student's 'Availability' is set to 'No.'

  6. Post Match Data to Supervisor-Student Pairs Table

    This flow is triggered when an HTTP POST request is sent to a specific API, with a payload containing JSON data in a particular format. When the API receives this POST request, the flow begins and extracts the JSON from the payload. This JSON data is the output of the matching algorithm. It contains a list of supervisors, each with a list of students that they matched with and the reason for the match (common ACM keyword). These are each extracted by the flow and inserted into the 'Supervisor-Student Pairs' table. From the JSON data, the supervisorResponseIds and studentResponseIds are used to get the name of the user that submitted the response. This is what is entered into the table, rather than the responseId itself. Each inserted pair is also given a default 'Approved' field value of 'Pending,' which can later be edited by the administrator.

  7. HTTP Get Flows

    These 5 flows all work in exactly the same way and exist for the same purpose: to extract data from our database into the JavaScript code that powers the matching algorithm. Each flow is triggered when its API receives an HTTP GET request. This is a plain request with no payload. Each flow then sends back the appropriate data from the correct table. Not all fields are of use to the matching algorithm, so only a select few are sent back to the JavaScript in the HTTP response. The table whose data is pulled is named in each flow's title. These flows could be combined into one large flow with all the data sent in one response, but the response would then be so large that it would be hard to read for anyone who needs to debug the code in the future. Separating them also makes the JavaScript code more orthogonal, as each function performs only one task.
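The prefix-based mapping performed by the Insert Topic-ACM Records flow (flow 3 above) can be sketched in JavaScript. This is only an illustration of the idea: the actual work is done by a Power Automate flow, and the record shape (`code`, `keyword`) and the sample data used with it are assumptions.

```javascript
// Hypothetical record shape: { code: "2.1.3", keyword: "Sorting" }.
// Returns one lookup row per ACM keyword whose code starts with the prefix.
function mapKeywordsToTopic(acmKeywords, sectionPrefix, topic) {
  return acmKeywords
    .filter((entry) => entry.code.startsWith(sectionPrefix))
    .map((entry) => ({ topic, acmKeyword: entry.keyword }));
}
```

One design point to keep in mind: a plain prefix test on '2.1' also matches a code such as '2.10', so a prefix with a trailing separator ('2.1.') is safer if sections can reach two digits.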


    Liquid is a template language, written alongside HTML, that uses a combination of objects, tags, and filters inside template files to display dynamic content. Liquid, like any template language, creates a bridge between an HTML file and a data store; in our context, that is the data we use for matching.

    Liquid can be learnt fairly quickly using the official documentation. Here are some examples where we used Liquid in our project:

    Hiding Sensitive Data During Development
        • At times during our development, we needed to display data that should be available to authenticated users only. To do this, we used Liquid control flow tags to show or hide data based on this condition:
        •    {% if user %}
        •      {% include 'my_page' %}
        •    {% endif %}

        • If the user is not signed in, 'user' will not exist, meaning the variable holds a value of 'Nil'. This makes the 'user' variable 'falsy', so the condition evaluates to false and my_page is not displayed.

        • It should be noted that this is not the most sound or elegant way to hide content from unauthorised users. Power Apps offers mechanisms to achieve this through Web Roles.

    Displaying User-Specific Information
        • It's often useful to have a webpage display different content depending on who the user is and what information we have about them. Objects may have certain attributes which can be used to achieve this. As a simple example, we used:
        •   {{ user.fullname }}
        • to display a header that shows the correct name for the user who is currently logged in.

    Iteration to Display Elements in an Array
      • Under different conditions, different content may need to be included in the header. We used Liquid to display all headings which should be displayed within a given Web Link Set. Here is a heavily simplified method with which we did this:
      •   {% for link in primary_nav.sublinks %}
      •     {% unless forloop.first %}
      •       <div class="vertical-divider"></div>
      •     {% endunless %}
      •     {{ link.name }}
      •   {% endfor %}

    We had initially decided to use Project Operations for the supervised matching section of our solution. However, after closer inspection of how the matching works in Project Operations, we decided that we wouldn’t have enough control over how students were matched to supervisors. Instead, we created a custom algorithm using JavaScript.

    Our matching process can be split into 7 main, distinct components.

    1. Collection of Data

      Since our code is written in pure JavaScript, the required data is pulled from our solution’s database using simple HTTP GET requests. There are 5 different requests that pull 5 separate sets of data. This data is initially collected using Power Automate instant cloud flows (as seen above). Each flow can be broken down into three steps.

      LeeStott_14-1668077143010.png

       

      As you can see in this image, the flow first waits for a GET request to its specified API. Since the request doesn't need to send any data, the payload format is left empty.

      LeeStott_15-1668077143041.png

       

      Next, the flow performs a ‘list rows’ operation on the specified table to get all of its rows in JSON (JavaScript Object Notation) format. Only the columns containing data necessary for matching are included, to reduce the size of the payload sent back to our code. In this case, the columns 'acmclassificationcodesid', 'keyword', and 'id' are pulled from the 'ACM Classification Codes' table.

      LeeStott_16-1668077143038.png

       

      Finally, an HTTP response with status code 200 is sent back, including the JSON-formatted list of data. This procedure is repeated 5 times, once for each table that we require data from.
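On the JavaScript side, pulling the tables could look something like the sketch below. It assumes each flow answers a plain GET with status 200 and a JSON array of rows, as described above; the function names and the idea of passing a name-to-URL map are illustrative choices, and the real flow URLs are not given here.

```javascript
// Fetch one table's rows from its GET flow. `fetchImpl` is injectable so the
// sketch can be run without network access.
async function fetchTable(url, fetchImpl = fetch) {
  const response = await fetchImpl(url, { method: "GET" });
  if (response.status !== 200) {
    throw new Error(`GET ${url} returned status ${response.status}`);
  }
  return response.json();
}

// Fetch every table in parallel. `flowUrls` maps a table name of your choice
// to the URL of the flow that serves it (placeholder values in practice).
async function fetchAllTables(flowUrls, fetchImpl = fetch) {
  const names = Object.keys(flowUrls);
  const results = await Promise.all(
    names.map((name) => fetchTable(flowUrls[name], fetchImpl))
  );
  return Object.fromEntries(names.map((name, i) => [name, results[i]]));
}
```

Keeping one small function per request mirrors the one-flow-per-table design: each response stays small enough to inspect by hand when debugging.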

    2. Cleaning the Received Data

      When the data for each table is received by the code, it needs to be cleaned and put into a named JavaScript object so that it can be easily accessed by the algorithm. Here is an example of one of the aforementioned GET requests. The data is first fetched from the API of the flow that pulls data from the desired table. The data is then parsed into a new JavaScript object with more human-readable attribute names. Data is also 'cleaned' where it needs to be. For example, in this function, we remove any records that do not have a specified 'responseId.'

      LeeStott_17-1668077143020.png
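A cleaning step of this kind can be sketched as a pure function. The raw column names below (`raw_responseid`, `raw_keyword`) are invented for illustration, since the actual Dataverse column names are not listed here; only the idea of renaming fields and dropping rows without a 'responseId' comes from the description above.

```javascript
// Rename raw columns to readable attribute names and drop rows that have no
// responseId. Raw column names are hypothetical placeholders.
function cleanSupervisorAcmRows(rawRows) {
  return rawRows
    .map((row) => ({
      responseId: row.raw_responseid || null,
      acmKeyword: row.raw_keyword || null,
    }))
    .filter((row) => row.responseId !== null);
}
```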

       

    3. Parsing of the Data

      The received student data and supervisor data are then parsed into a more usable format. This is done for two reasons. Firstly, when a student fills out a response form, they choose which broad topics they are interested in. Each of these topics relates to hundreds of ACM keywords, so the ACM keywords need to be collated into one student response object. Secondly, when a supervisor fills out a response form, a new Supervisor-ACM table record is generated for each ACM keyword they selected as relevant to them. To make matching simpler, these too are collated so that there is only one supervisor response object per unique supervisor, with many ACM keywords related to it.

      The example below shows the supervisor response parsing function. When sorting through Supervisor-ACM table records, any record whose responseId has already been seen has its ACM keyword appended to the existing object, rather than a new object being created.

      LeeStott_18-1668077143024.png
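The collation described above amounts to grouping records by responseId. A minimal sketch, assuming cleaned records of the hypothetical shape `{ responseId, acmKeyword }`:

```javascript
// Collapse one-record-per-keyword rows into one object per supervisor
// response, each holding the list of that supervisor's ACM keywords.
function parseSupervisorResponses(records) {
  const byResponseId = new Map();
  for (const { responseId, acmKeyword } of records) {
    if (!byResponseId.has(responseId)) {
      byResponseId.set(responseId, { responseId, acmKeywords: [] });
    }
    // Seen before or just created: append to the existing object.
    byResponseId.get(responseId).acmKeywords.push(acmKeyword);
  }
  return [...byResponseId.values()];
}
```

A `Map` keeps the lookup by responseId constant-time and preserves the order in which supervisors were first seen.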



    4. Finding All Possible Matches

      The next step is to find all students who could be matched to each supervisor. These matches are found based on whether the supervisor and student share a common ACM keyword in their responses. If a student is found to match a supervisor, the student's response object is inserted into an array of 'matched students' for the supervisor. This also contains the ACM keyword that they matched on and the priority that the student assigned to it. This process is repeated for all students for all supervisors.

      Below is the actual code for finding matches. It simply loops through each parsed supervisor response object (formed in the previous step) and calls the findMatchingStudents function. The findMatchingStudents function then loops through each parsed student response object and checks whether the response contains an ACM keyword that is also present in the current supervisor response object. If so, the student object is added to the supervisor's list of matched students, as described above. The list of matches for that supervisor is returned. This is repeated for every supervisor.

      LeeStott_19-1668077143015.png
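For readers without access to the image, the same idea can be sketched as follows. This is a simplified reconstruction, not the team's code: the object shapes are assumptions, and recording only the first shared keyword per student is a choice made for this sketch.

```javascript
// Assumed shapes:
//   supervisor: { responseId, acmKeywords: ["K1", ...] }
//   student:    { responseId, acmKeywords: [{ keyword: "K1", priority: 1 }, ...] }
function findMatchingStudents(supervisor, students) {
  const supervisorKeywords = new Set(supervisor.acmKeywords);
  const matches = [];
  for (const student of students) {
    // First keyword the student shares with this supervisor, if any.
    const shared = student.acmKeywords.find((k) => supervisorKeywords.has(k.keyword));
    if (shared) {
      matches.push({
        studentResponseId: student.responseId,
        keyword: shared.keyword,
        priority: shared.priority,
      });
    }
  }
  return matches;
}

// Repeat the search for every supervisor.
function findAllMatches(supervisors, students) {
  return supervisors.map((supervisor) => ({
    supervisorResponseId: supervisor.responseId,
    matchedStudents: findMatchingStudents(supervisor, students),
  }));
}
```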



    5. Ordering

      Then, for each supervisor, the matches are ordered based on the priority that the student assigned to the ACM keyword that they matched with the supervisor on.

      LeeStott_20-1668077143044.png
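The ordering step is essentially a sort on the priority field. The sketch below assumes that a lower number means a higher priority (1 is the student's top choice); that convention is an assumption, not something stated above.

```javascript
// Return a new array sorted so the highest-priority match comes first.
// Assumes lower numbers mean higher priority; the input is left untouched.
function orderMatches(matchedStudents) {
  return [...matchedStudents].sort((a, b) => a.priority - b.priority);
}
```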



    6. Reviewing the Matches

      The final stage of the code for the matching process is the matches review. For each supervisor, the matches that were found are reviewed one by one. Using the total number of unique student responses and supervisor responses, a minimum number of matches per supervisor is calculated. Each supervisor then has that many of their student matches 'finalised' and inserted into a final matches array. This is why the student matches list was ordered for each supervisor in the previous step.

      This section of code also creates a list of any supervisors that were left with no students matched to them at the end of this process. This should only ever happen because there were more supervisors than students, not because no students were interested in what the supervisor was willing to supervise, although that is still theoretically possible.

      LeeStott_21-1668077143046.png

       

      In a similar way, this piece of code finds any students that couldn't be matched to any of the supervisors. Again, in practice this will probably never happen, since the topics that students choose as interests in their responses encompass a vast number of ACM keywords. However, since it is still technically possible, it is handled, as it would be useful to the administrator.

      LeeStott_22-1668077143035.png
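The review stage can be approximated by the sketch below. Several details are assumptions made for illustration rather than the team's exact rules: the quota formula (students divided by supervisors, rounded down, at least one), the rule that a student is finalised with at most one supervisor, and the definition of an unmatched student as one with no finalised match.

```javascript
// supervisorMatches: [{ supervisorResponseId, matchedStudents: [...] }], with
// each matchedStudents list already ordered by priority.
function reviewMatches(supervisorMatches, studentResponseIds) {
  const supervisorCount = supervisorMatches.length;
  // Assumed quota: spread students evenly, at least one per supervisor.
  const quota = supervisorCount === 0
    ? 0
    : Math.max(1, Math.floor(studentResponseIds.length / supervisorCount));

  const taken = new Set(); // students already finalised with a supervisor
  const finalMatches = [];
  const unmatchedSupervisors = [];

  for (const { supervisorResponseId, matchedStudents } of supervisorMatches) {
    let finalised = 0;
    for (const match of matchedStudents) {
      if (finalised >= quota) break;
      if (taken.has(match.studentResponseId)) continue;
      taken.add(match.studentResponseId);
      finalMatches.push({ supervisorResponseId, ...match });
      finalised += 1;
    }
    if (finalised === 0) unmatchedSupervisors.push(supervisorResponseId);
  }

  const unmatchedStudents = studentResponseIds.filter((id) => !taken.has(id));
  return { finalMatches, unmatchedSupervisors, unmatchedStudents };
}
```

Returning the two 'unmatched' lists alongside the final matches gives the administrator everything needed to decide whether to rerun the process.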



    7. Sending the Matches Back to Our Solution

      After all of the matches between students and supervisors have been reviewed, and matchless supervisors and students have been found, the data is properly formatted and sent back to our solution's database using an HTTP POST request.

      LeeStott_23-1668077143033.png
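Formatting and sending the results might look like the sketch below. The payload shape follows the earlier description (a list of supervisors, each with their matched students and the keyword that caused the match), but the exact property names and the URL passed in are assumptions.

```javascript
// Group flat match records into one entry per supervisor.
// Property names are hypothetical; the flow defines the real schema.
function buildMatchPayload(finalMatches) {
  const bySupervisor = new Map();
  for (const match of finalMatches) {
    if (!bySupervisor.has(match.supervisorResponseId)) {
      bySupervisor.set(match.supervisorResponseId, {
        supervisorResponseId: match.supervisorResponseId,
        students: [],
      });
    }
    bySupervisor.get(match.supervisorResponseId).students.push({
      studentResponseId: match.studentResponseId,
      reason: match.keyword, // the common ACM keyword
    });
  }
  return { supervisors: [...bySupervisor.values()] };
}

// POST the payload to the flow's endpoint (URL supplied by the caller).
async function postMatches(url, finalMatches, fetchImpl = fetch) {
  const response = await fetchImpl(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildMatchPayload(finalMatches)),
  });
  if (!response.ok) {
    throw new Error(`Posting matches failed with status ${response.status}`);
  }
  return response.status;
}
```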

       

      There is another flow in our solution waiting for a POST request to a specific API, expecting a payload containing JSON of a specific shape.

      LeeStott_24-1668077143018.png

       

      The flow then iterates through the payload to extract all 'pairs' between a supervisor and student. From each pair, it takes the associated studentResponseId and supervisorResponseId. It uses these IDs to get the contacts of the people that submitted the responses.

      LeeStott_25-1668077143030.png

       

      The rest of the data in the pair is then extracted and a new 'Supervisor-Student Pairs' table record is generated.

      LeeStott_26-1668077143027.png

      LeeStott_27-1668077166940.png

       

      As described in the above section, the results from the matching algorithm are inserted into the 'Supervisor-Student Pairs' table, which is viewable by the administrator. Here the matches can be reviewed and then approved or rejected. When a pair record is edited as shown below, a Power Automate flow is run, which performs the following logic.

      LeeStott_28-1668077166942.png

       

      If a pair is rejected by an administrator, the involved student's availability is set to 'Yes' and, if the supervisor's remaining capacity is less than their total capacity, the supervisor's availability is also set to 'Yes.'

      If a pair is instead accepted by an administrator and the supervisor's remaining capacity is equal to 0, the student's availability is set to 'Yes,' the supervisor's availability is set to 'No,' and the pair's status is set to 'Rejected.' Otherwise, the student's availability is set to 'No' and the supervisor's remaining capacity is reduced by 1.

      The administrator should review matches and rerun the algorithm until they are completely satisfied with the outcome.
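The approve/reject rules above are implemented as a Power Automate flow, but the branching is easier to check when written out as a small function. The sketch below restates those rules in JavaScript; the property names (`approved`, `availability`, `remainingCapacity`, `totalCapacity`) are illustrative stand-ins for the table columns.

```javascript
// Returns updated copies of the pair, student, and supervisor records after
// an administrator's decision, leaving the inputs untouched.
function applyPairDecision(pair, student, supervisor) {
  const next = {
    pair: { ...pair },
    student: { ...student },
    supervisor: { ...supervisor },
  };

  if (pair.approved === "Rejected") {
    next.student.availability = "Yes";
    if (supervisor.remainingCapacity < supervisor.totalCapacity) {
      next.supervisor.availability = "Yes";
    }
  } else if (pair.approved === "Approved") {
    if (supervisor.remainingCapacity === 0) {
      // Supervisor is full: the approval cannot stand.
      next.supervisor.availability = "No";
      next.student.availability = "Yes";
      next.pair.approved = "Rejected";
    } else {
      next.supervisor.remainingCapacity -= 1;
      next.student.availability = "No";
    }
  }
  return next;
}
```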


    Below is a diagram showing the relationship of all tables listed in the Table Descriptions section:

    LeeStott_29-1668077194461.jpeg

     

     

    1:N, N:1, and N:N Relationships
    In Power Apps, an N:1 relationship is called a Lookup. A 1:N relationship is simply a reversed Lookup (a lookup from the other table into this one). Creating either of these will create a column in the table that holds the lookup, and you must give this column a name. N:N relationships are not tied to a column in the same way, and no column needs to be created to create an N:N relationship. Instead, a “hidden” table is created to enable the N:N relationship. This table cannot be accessed through Dynamics; it can only be queried using FetchXML in XrmToolBox. If you require the data that would be stored in this table, please reference the Issues, Workarounds, and Ideal Solution section. Detailed instructions on creating relationships can be found here.

    Subgrids
    Subgrids can be used to relate records in an N:N relationship. Two subgrids are used on the Supervisor Application (advanced) form to receive input about the user’s Research Groups and ACM Keywords. Subgrids can be used on advanced forms, but not on the first tab if the form has multiple steps. Instructions on adding and configuring a subgrid can be found here.


    Outcomes 
    This was an extremely interesting project for the students and gave the team exposure to a wide range of new technologies and services.


    The team built two web apps, one for the students and supervisors to interact with, and one with higher privileges for the administrator to oversee the process. They brought every stage of the project allocation process to one site: academics register to supervise students, students browse through these options to reach out to supervisors, students who haven't found a supervisor apply to be automatically allocated one, and matches are made based on common interests. Whereas these stages used to happen through different mediums, everything has been brought to a single portal.

    The team maintained close correspondence with the lead of their own Final Year Project module to get feedback on their progress and update their course of action. As someone who would be using the solution, his positive feedback and approval were convincing confirmation that the team had produced a good outcome.

    FYP Portal

    Model Driven App


    It was great to see how the team approached the project, documented their outcomes, insights, and experiences, and provided engineering feedback to Microsoft on their experience of using the various tools and services.

    More details on the project 
    Team 42 - Modern Workplace Automation (james-parky.github.io)
