Joint Seat Allocation: An algorithmic perspective
Technical Report
Surender Baswana Partha P. Chakrabarti V. Kamakoti Yash Kanoria
Ashok Kumar Utkarsh Patange Sharat Chandran
September 2015

Abstract

Until 2014, admissions to the Indian Institutes of Technology (IITs) were conducted under one umbrella, whereas admissions to the non-IIT Centrally Funded Technical Institutes (CFTIs) were conducted under a different umbrella, the Central Seat Allocation Board (CSAB). The same set of candidates was eligible to apply for a seat in each of the two sets of institutes, and several hundred candidates would indeed receive two seats, one from each set. Each such candidate could use at most one of the seats, leaving a vacancy in the other; this would be noticed much later, in many cases after classes began. Such seats would either remain vacant or be reallocated at a later stage, leading to inefficiency in seat allocation in the form of unnecessary vacancies, and also to misallocation of seats (e.g., a particular CSAB seat could be offered to a candidate A, despite having been denied earlier to a candidate B with a better rank, who had meanwhile taken some IIT seat). The two umbrellas also operated under different admission windows, with the net result that classes would begin later in the academic year compared to, say, colleges offering the sciences or the arts. In 2015, a new combined seat allocation process was implemented to resolve some of these issues. The process brought all CFTIs under one umbrella for admissions: 86 institutes with approximately 34,000 available seats. Each candidate submitted a single choice list over all available programs and received no more than a single seat from the system, based on the choices and the ranks in the relevant merit lists. We describe the new Multi-Run Multi-Round Deferred Acceptance scheme that was used for this combined allocation process.
Crucially, unlike the admissions processes of 2014 and earlier, the scheme seamlessly handles multiplicity of merit lists across different institutes and programs; indeed, every program may have a separate merit list, and these lists need not have any relation to each other. In addition, the scheme meets several other desired objectives. It makes it safe and optimal for candidates to report their true preferences over programs. The seat allocation produced does not waste seats and is fair in the sense that it does not give a seat to a lower ranked candidate when it was denied to a higher ranked candidate. Further, the allocation is optimal in a formal sense, providing each candidate with the best possible seat subject to fairness. Without compromising on these tenets, the scheme factors in various business rules, including reservations for different birth categories, reservations for home state candidates, and rules regarding de-reservation when sufficient candidates are not available. The scheme also factors in changes that are inevitable when it is discovered, for instance, that candidates have inadvertently or otherwise incorrectly declared their birth category, or that the qualifying criteria have been incorrectly recorded by state education authorities. Our scheme is inspired by the single run Deferred Acceptance algorithm attributed to Gale and Shapley. In this report, based on the experience of running the scheme in July 2015, we also present several important implementation considerations and some recommendations for the future. We strike two notes of caution that must be heeded to maximize the efficiency of the seat allocation: (i) the calendar should be constructed so that a common allocation is implementable both in theory and in practice; (ii) educating candidates to fill in their choices properly is outside the scope of this report, but it is an equally crucial and difficult task that must be given attention.
In the end, the ability to gracefully handle multiple merit lists gives us hope that all undergraduate engineering admissions in the country, beyond the CFTIs, can beneficially use the suggested scheme.

  • 1 Introduction
    • 1.1 Organization of the report
  • 2 Preliminaries
    • 2.1 Background and Challenges
    • 2.2 Formal Problem Statement
    • 2.3 Previous Work
  • 3 A Combined Seat Allocation Scheme
    • 3.1 Initial Information collection
    • 3.2 Basic DA algorithm
    • 3.3 Pseudocode
      • 3.3.1 Multi-Round Scenario
      • 3.3.2 Remarks
  • 4 Business Rules
    • 4.1 Incorporating Quotas
      • 4.1.1 Virtual programs
      • 4.1.2 Virtual Preference List
      • 4.1.3 Virtual Merit Lists
      • 4.1.4 Preparatory courses allocation
      • 4.1.5 Updated Algorithm: DA With Quotas
    • 4.2 DA with multiple candidates having same rank
    • 4.3 DA with International Students
    • 4.4 DA with candidates from defense service quota
      • 4.4.1 Algorithm
    • 4.5 DA for Admission into IITs, NITs, and other GFTIs
      • 4.5.1 Virtual Preference Table for IITs
      • 4.5.2 Virtual Preference Table for NITs
      • 4.5.3 Virtual Preference Table for Other GFTIs
      • 4.5.4 Summary
  • 5 Multi-round Implementation
    • 5.1 MRDA
    • 5.2 Updating inputs after first round
    • 5.3 Pre-processing applicable only after third round
    • 5.4 Preference list editing
    • 5.5 Other Inputs Needed After First Round
    • 5.6 Summary
  • 6 De-reservations
    • 6.1 Multi-Run DA with De-reservation
      • 6.1.1 Multi-Run DA Algorithm
  • 7 Recommendations
    • 7.1 Business Rules Considerations
    • 7.2 Closure Round
    • 7.3 Choice filling
    • 7.4 New Institutes
    • 7.5 Data collection and analysis
    • 7.6 Algorithmic Considerations
  • 8 Appendix
    • 8.1 Proof of Correctness
    • 8.2 Computational considerations of the DA algorithms
    • 8.3 Single Run DA with De-reservation
    • 8.4 Computing Min-Cutoff and setting Cat-Change
    • 8.5 Choice Filling
    • 8.6 Survey Questions
    • 8.7 Detailed algorithm for handling DS candidates
      • 8.7.1 Example of Race Condition
      • 8.7.2 Detecting Race Condition
      • 8.7.3 Pseudocode
    • 8.8 Validation Modules
      • 8.8.1 Candidate specific validation modules
      • 8.8.2 Program specific validation modules
      • 8.8.3 Test Assisting modules
    • 8.9 Implementation Details of MRDA
      • 8.9.1 Algorithm: interface and interactions
      • 8.9.2 Input format
      • 8.9.3 Output format
    • 8.10 Some statistics of Joint Allocation

Chapter 1

Introduction

Allocation of seats (earlier, counseling) to colleges is a key step in the admission process from an institutional perspective. It decides the final fruit of all the hardships endured by a candidate over years of schooling and special preparation for competitive examinations. Importantly, this decision shapes the candidate’s future career. Thus, a lot of care needs to be taken to ensure that the right fruit is delivered to the right candidate. Two different points of view need to be considered to correctly allocate seats: that of the candidates, and that of the participating institutions. All candidates should get the best available choice, and every institution should fill all seats in the various courses it offers. Consistent efforts have been made by the different organizations in charge of counseling to satisfy both viewpoints. Many institutions have grown to a stature of national importance. Each of them has its own entrance examination for admission, as well as a separate rank list and a separate allocation process. Previously, the candidate filled a choice sheet for each of these institutions or, sometimes, for a group of institutions. Thus, there was no provision in the past for the candidate to list her choices in a single choice sheet across all institutions. Every choice sheet she filled could indicate only her preferences among the available choices in the institution she applied to. As a result, a candidate who filled K choice sheets may receive admission to up to K programs, from which she has to select one. However, as counseling dates are not synchronized, and there is no legal provision for overbooking seats, the candidate may rationally block more than one seat, for safety, by paying the required fees. At most one of these seats is occupied by the candidate, whereas the remaining seats go vacant.
As the different allocation processes have no mechanism to track this, many seats in courses that are unpopular (in the eyes of the young candidates) end up unfilled. At the same time, many candidates on the waiting lists cannot be admitted, because the decision of the deserting candidates who have been allotted a course is known only after the courses start, by which time it is too late. Different institutions and admission boards have tried different mechanisms to alleviate this problem, which in our opinion are not very effective, leading to allocative inefficiency as well as logistical difficulties for candidates and institutes. Some of these mechanisms are:

of conducting combined admissions across a set of institutes, as well as the allocative efficiency of the process. The first practical issue is that each of the merit lists needed for admissions to all involved programs must be ready before admissions are conducted.^6 This may limit the extent to which the set of involved institutes can be expanded. The second issue is that there are always going to be colleges that are not part of the system, but are of interest to candidates who are participating.^7 Thus, a substantial fraction of candidates will vacate the allotted seat, in many cases even after paying the fees and officially accepting the allocation. We discuss methods to minimize the impact of such vacancies in Chapter 7.

1.1 Organization of the report

Chapter 2 presents the problem formally along with introducing some basic notations and terminologies to be used in the rest of the report. Chapter 3 first presents the deferred acceptance (DA) algorithm in its simplest form. Later it describes the version that handles two important aspects of multiple rounds, namely, Min-Cutoff and Cat-Change. Chapter 4 describes some fundamental business rules for admissions into three types of institutes, and how they are incorporated using virtual programs and virtual preference tables. Chapter 5 presents the complete description of the multiple round DA algorithm. Chapter 6 presents the way de-reservation is handled in each round by executing multiple runs of our algorithm. In view of this consideration, we term our scheme as MRDA, a multi-run deferred acceptance scheme. Chapter 7 presents the recommendations of the technical committee based on the outcome of the Joint Seat Allocation 2015. These recommendations should be considered seriously for future years. Our scheme was first described in Feb 2015, and later implemented in July 2015. In Chapter 8, which is the Appendix, we provide the details of the following aspects of MRDA algorithm and the Joint Seat Allocation 2015.

  1. Validation modules required to ensure that the output meets all business rules.
  2. Some statistics of Joint Seat Allocation 2015.
  3. Implementation details of MRDA.

The appendix also contains some notes on the proof of correctness, computational considerations, and an alternate way to handle de-reservation in a single run of MRDA.

(^6) In 2015, this posed a real problem. The CSAB merit list preparation required board exam marks, which were obtained only on July 1 (after intense efforts), whereas the first round had been planned to begin on June 25th; thus the first round had to be postponed by about a week.

(^7) There are about 1.5 million engineering seats in India, whereas only about 34,000 of these seats were filled using the system described here. Even if, hypothetically, all engineering colleges in the country become part of a single combined system, there will still be candidates who choose not to study engineering, or to study outside the country.

Chapter 2

Preliminaries

2.1 Background and Challenges

In the case of a single merit list, the candidates are sequentially ordered, and allotment of seats is done by processing candidates in the same order and allowing each candidate to choose from among the seats that are still available.^1 For example, consider three candidates, A, B and C. If in the NIT merit list, they are ranked as A, B and C, then, given the choices of A, B and C for different branches, we give preference to A over B and C. Likewise, we give preference to B over C. The allotment process first allocates a seat to A, then to B, and finally to C. In practice, the process is much more intricate as one has to take care of various categories like GEN, OBC-NCL, SC, ST, PwD, their cut-offs, then recouping of unallocated OBC-NCL and PwD seats. Nevertheless, a solution can, and has been devised. Unfortunately, if we have multiple merit lists, the issues are much more complex. Consider a simple scenario of three merit lists, namely those of the IITs, the NITs and a merit list for the Architecture (ARCH) program in which an additional examination needs to be cleared. Again consider our 3 candidates, A, B and C. Let the three merit lists be as follows:

NIT merit list: (1, A), (2, B) and (3, C) (as mentioned earlier)

IIT merit list: (1, B), (2, C) and (3, A)

ARCH merit list: (1, C), (2, A) and (3, B)

Note that now, unlike in the case of a single merit list, there is no overall strict ordering among A, B and C as their relative performance is different in the three examinations. In practice, we may have several merit lists covering tens of examinations and lakhs of candidates. Now consider a joint seat allocation process where each candidate fills up a single choice sheet. An example is shown below where we assume, for illustration, that there is only a single seat available in each program.

(^1) This is known as serial dictatorship in the literature [7].
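The single-merit-list procedure described at the start of this section (serial dictatorship) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `serial_dictatorship`, the branch names and the preference lists are assumptions made up for the example, and categories, cut-offs and recouping of seats are ignored.

```python
def serial_dictatorship(merit_order, prefs, capacity):
    """Allot seats by processing candidates in merit order.

    merit_order: candidates of a single merit list, best rank first.
    prefs:       candidate -> list of programs, most preferred first.
    capacity:    program -> number of seats.
    """
    seats_left = dict(capacity)  # copy, so the caller's dict is untouched
    allotment = {}
    for cand in merit_order:
        allotment[cand] = None  # None means "no seat"
        for prog in prefs.get(cand, []):
            if seats_left.get(prog, 0) > 0:
                # The candidate takes her best choice that is still available.
                seats_left[prog] -= 1
                allotment[cand] = prog
                break
    return allotment


# Hypothetical data: NIT merit order A, B, C and one seat per branch.
nit_order = ["A", "B", "C"]
prefs = {"A": ["CSE", "EE"], "B": ["CSE", "EE"], "C": ["CSE", "ME"]}
capacity = {"CSE": 1, "EE": 1, "ME": 1}
result = serial_dictatorship(nit_order, prefs, capacity)
```

With a single merit list this greedy pass is all that is needed; it is exactly the step that stops working once the programs rank the candidates differently.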

This document presents details of design, analysis and implementation of a scalable solution to the problem mentioned above.

2.2 Formal Problem Statement

We start with a simplified problem with multiple merit lists for the different programs^3. Different programs may have the same merit list, or may have different merit lists. As discussed earlier, the problem is complex, but we term it ‘simplified’ for now because (for example) there are no quotas or ties in the presentation of this section. Let P be the set of programs, with |P| being the number of programs. For a program p ∈ P, let c(p) denote the number of seats in p. Let A denote the set of applicants (or candidates), with |A| being the number of candidates. Each candidate is allowed only one seat in the system. Candidates are asked to submit a choice (or preference) list over programs; the choice list is a strictly ordered list containing any subset of the programs in P. We denote the preference list of candidate x by

$$\mathrm{Pref}(x) = p_{x,1},\ p_{x,2},\ \ldots,\ p_{x,n(x)} \qquad (2.1)$$

which means that candidate x has listed n(x) programs, with program $p_{x,1}$ being her top choice, program $p_{x,2}$ being her next choice and so on. The candidate is asked to list only programs she is interested in. Each program must submit its capacity c(p), as well as its merit list of candidates, which is a strictly ordered list containing a subset of the candidates in A. We denote the merit list of program p by

$$\mathrm{Merit}(p) = x_{p,1},\ x_{p,2},\ \ldots,\ x_{p,m(p)} \qquad (2.2)$$

which means that program p has ranked m(p) candidates in its merit list, with candidate $x_{p,1}$ being ranked 1, candidate $x_{p,2}$ being ranked 2, and so on. Let μ(x) ∈ P be the program allotted to candidate x by some mechanism, with some candidates possibly not getting any seat, in which case we write μ(x) = ∅. Denote the overall allocation by $\bar{\mu} = (\mu(x))_{x \in A}$. We want a mechanism with the following properties:

  1. Fairness: Suppose candidate x is allotted program p. Then for any other candidate y such that y has a better (smaller) rank than candidate x in the merit list of p, the allocation of y should be p or some other program that y prefers to p.^4 Further, the mechanism must ensure that a candidate is not allotted a program that she did not list, and that no program is allotted to a candidate who is not a part of the merit list of the program.^5
  2. Optimality: There does not exist any other allocation $\bar{\mu}'$ that satisfies the Fairness property and provides any candidate x with an allocation she prefers to μ(x) based on her preference list.
  3. Truthfulness: The mechanism must make it optimal for candidates to report their true preferences.

(^3) One may use the words ‘program’, ‘course’, ‘branch’ and ‘programme’ interchangeably in the rest of the document. For ease of understanding at this stage, one may think of a program as a college having a single course, say, Electrical Engineering in IIT Kharagpur.

(^4) This property is called stability in the literature [4].

(^5) This property is called individual rationality in the literature [7].

A priori, it is unclear that a mechanism satisfying all these properties exists. However, it turns out that such a mechanism does exist, and was constructed by Gale and Shapley [4].
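To make the Fairness property concrete, the following Python sketch checks a given allocation against it. It is an illustration under stated assumptions, not part of the scheme: the function name `fairness_violations` and the preference lists are hypothetical, `None` stands for μ(x) = ∅, and a better-ranked candidate y is counted as aggrieved only if y actually listed the program in question.

```python
def fairness_violations(prefs, merit, mu):
    """List the violations of fairness and individual rationality in mu.

    prefs: candidate -> list of programs, most preferred first.
    merit: program -> list of candidates, best rank first.
    mu:    candidate -> allotted program, or None for no seat.
    """
    rank = {p: {x: i for i, x in enumerate(lst)} for p, lst in merit.items()}

    def prefers(y, p):
        """True if y listed p and likes it strictly more than her allocation."""
        choices = prefs.get(y, [])
        if p not in choices:
            return False
        current = mu.get(y)
        if current is None or current not in choices:
            return True
        return choices.index(p) < choices.index(current)

    problems = []
    for x, p in mu.items():
        if p is None:
            continue
        if p not in prefs.get(x, []):
            problems.append((x, p, "program not on the candidate's list"))
        if x not in rank.get(p, {}):
            problems.append((x, p, "candidate not on the program's merit list"))
            continue
        for y, r in rank[p].items():
            if r < rank[p][x] and prefers(y, p):
                problems.append((x, p, f"{y} is ranked higher and prefers {p}"))
    return problems


# Merit lists from Section 2.1; the preference lists are made up.
merit = {"NIT": ["A", "B", "C"], "IIT": ["B", "C", "A"], "ARCH": ["C", "A", "B"]}
prefs = {"A": ["IIT", "NIT"], "B": ["NIT", "IIT"], "C": ["ARCH"]}

fair = fairness_violations(prefs, merit, {"A": "IIT", "B": "NIT", "C": "ARCH"})
unfair = fairness_violations(prefs, merit, {"A": "IIT", "B": None, "C": "ARCH"})
```

In the second allocation, B is left without a seat although B outranks A in the IIT merit list and listed IIT, so the check reports one violation. The sketch does not test Optimality or Truthfulness, which are properties of the mechanism rather than of a single allocation.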

2.3 Previous Work

An initial solution to the simplified problem was proposed by Gale and Shapley in 1962 [4] by formulating it as a “stable marriage problem”. The proposed solution was shown to have a multitude of desirable properties including fairness and candidate optimality [4], and truthfulness [3] (no student or group of students can benefit from misreporting their preferences, see Theorem 14). The Gale and Shapley mechanism has been adapted and implemented successfully in a multitude of real world settings, e.g., the National Residency Matching Program (NRMP) (running in the USA since 1951, redesigned in 1999 [6]), New York City high school admissions since 2003 [1], and school and college admissions in Hungary [2]. However, our problem involves a variety of business rules governing flows of different systems that need to be streamlined for evolving a sound process of common allocation. We need to suitably adapt the Gale-Shapley mechanism to incorporate these business rules while retaining all these desirable properties. In the sequel, we demonstrate these and come up with a practical algorithm.

applications, using the information that the candidates and programs have provided. We now state the algorithm in words. See Section 3.3 for the complete pseudocode.

Input:

  • For each program, its capacity and rank list of eligible candidates.
  • For each candidate, a preference list of programs.

Algorithm:

  1. All candidates apply (in any order) to the first program in their preference list.
  2. Each program p considers the applications it has received. Applications from candidates who are not eligible are immediately dropped. Let the capacity of the program be c(p) > 0. If the program has received c(p) or fewer eligible applications, then it retains all candidates on a waitlist. Otherwise, it ranks the candidates^1 making these requests (as per the merit list of the program) and retains only the c(p) best candidates on its waitlist, and rejects other candidates. If no rejections are made by any program, the algorithm terminates.
  3. Only rejected candidates apply (in any order) to the next program on their list, if any, and the algorithm returns to Step 2. If not even a single application is generated, then the algorithm terminates.

Output: When the algorithm terminates, the (final) “wait list” for each program p constitutes the candidates admitted to program p. We present complete details of the DA algorithm through pseudocode in the following sections. A formal proof of correctness and other computational considerations of the algorithm are described in Sections 8.1 and 8.2 of the Appendix, respectively.
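The three steps stated in words above can be sketched directly in Python. This is a minimal round-based sketch of the basic DA algorithm without quotas, ties or multiple rounds; it is not the authors' implementation, and the function name `deferred_acceptance` and the preference lists in the example are assumptions. Eligibility is taken to mean "appears on the program's merit list".

```python
def deferred_acceptance(prefs, merit, capacity):
    """Basic candidate-proposing deferred acceptance (no quotas, no ties).

    prefs:    candidate -> list of programs, most preferred first.
    merit:    program -> list of eligible candidates, best rank first.
    capacity: program -> number of seats.
    Returns (mu, waitlist): mu maps each candidate to a program or None.
    """
    rank = {p: {x: i for i, x in enumerate(lst)} for p, lst in merit.items()}
    next_choice = {x: 0 for x in prefs}  # 0-based position in Pref(x)
    waitlist = {p: [] for p in capacity}

    # Step 1: every candidate with a non-empty list applies to her first choice.
    applicants = [x for x in prefs if prefs[x]]
    while applicants:
        for x in applicants:
            waitlist.setdefault(prefs[x][next_choice[x]], []).append(x)

        # Step 2: each program drops ineligible applicants, then keeps its
        # c(p) best-ranked applicants and rejects the rest.
        rejected = []
        for p in list(waitlist):
            held = waitlist[p]
            ranks_p = rank.get(p, {})
            eligible = sorted((x for x in held if x in ranks_p),
                              key=lambda x: ranks_p[x])
            rejected += [x for x in held if x not in ranks_p]
            seats = capacity.get(p, 0)
            waitlist[p] = eligible[:seats]
            rejected += eligible[seats:]

        # Step 3: rejected candidates move to their next choice, if any.
        applicants = []
        for x in rejected:
            next_choice[x] += 1
            if next_choice[x] < len(prefs[x]):
                applicants.append(x)

    mu = {x: None for x in prefs}
    for p, held in waitlist.items():
        for x in held:
            mu[x] = p
    return mu, waitlist


# Merit lists from Section 2.1, one seat per program, made-up preferences.
merit = {"NIT": ["A", "B", "C"], "IIT": ["B", "C", "A"], "ARCH": ["C", "A", "B"]}
capacity = {"NIT": 1, "IIT": 1, "ARCH": 1}
prefs = {"A": ["IIT", "NIT", "ARCH"], "B": ["IIT", "NIT"], "C": ["IIT", "ARCH"]}
mu, waitlist = deferred_acceptance(prefs, merit, capacity)
```

In this example all three candidates first apply to IIT, which keeps B (its rank 1) and rejects C and A; in the next pass C applies to ARCH and A to NIT, no further rejections occur, and the algorithm stops.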

3.3 Pseudocode

Denote the rank of candidate x with respect to Merit(p) by Rank(x, p). For a list l, denote the number of entries in the list by Length(l). We present two versions of the pseudocode. The first version assumes that the entire seat allocation happens in a single round; it is presented to convey the general flow of the algorithm. The second version handles a variety of complications; in particular, it allows multi-round scenarios as described in Chapter 5. It also handles scenarios in which the ranks of candidates change between rounds (due to revision of marks) and in which the category of a candidate changes between rounds; these two cases are treated differently by the business rules.

(^1) assuming for the moment that equal ranks do not exist; ties are handled in a later section.

Algorithm 1 Deferred Acceptance (Simple Version)

INPUT:
    Candidates A, Programs P
    Preference list Pref(x) for each x ∈ A
    Capacity c(p) and merit list Merit(p) for each p ∈ P

OUTPUT:
    For each candidate x ∈ A, the allocation μ(x) ∈ P ∪ {∅}
    Also, for each program p ∈ P, the list of admitted candidates WaitList(p)

 1: for all p ∈ P do
 2:     Create an empty ordered list WaitList(p) that will consist of
 3:     candidates ordered by their rank in Merit(p)
 4: end for
 5: Create an empty queue Q
 6: for all x ∈ A do
 7:     i(x) ← 1                      ▷ Initialize list position to 1
 8:     if Length(Pref(x)) > 0 then
 9:         Enqueue(x, Q)             ▷ x enters queue Q
10:     end if
11: end for
12: while Q is non-empty do
13:     x ← Dequeue(Q)                ▷ x is any candidate removed from queue Q
14:     p ← p_{x,i(x)}                ▷ x applies to program p_{x,i(x)}
15:     if x is not eligible for p then
16:         Reject(x)
17:         continue
18:     end if

3.3.1 Multi-Round Scenario

Many candidates who obtain seats in the first round of seat allocation will surrender or reject their respective seats at a later stage. In order to utilize these surrendered seats, the business rules allow multiple rounds of seat allocation. Our pseudocode allows for additional optional inputs in order to implement second and later rounds of seat allocation. The full description of how to use Algorithm 2 to implement the seat allocation in each round is provided in Chapter 5.

  • Min-Cutoff(p) for each p ∈ P. This quantity is used in the second and later rounds of allocation (see Chapter 5 for details). Intuitively, a candidate with rank better than or as good as Min-Cutoff(p) will never be rejected by p, regardless of the capacity of p. Candidates offered a seat in the first round will thus be no worse off in subsequent rounds. Note that there might also be candidates whose ranks improve (or worsen) in subsequent rounds. In such cases, the allocated seat shall be what the candidate would have got in the first round on the basis of the revised rank. This seat can be a supernumerary seat. We use the notation x|y to indicate that Rank(x, p) is as good as, or better than, Min-Cutoff(p), and Rank(y, p) is worse.
  • Another optional input is the list Cat-Change, which consists of candidates whose category may have changed. For example, a person presumed to be in the OBC-NCL category in an earlier round may be reclassified as a general category candidate. As per the business rules, in this case the candidate will be allocated a seat in the best possible choice of academic program (in terms of the filled-in choices) that has unfilled seats, and supernumerary seats are NOT created.
  • The starting position on the preference list i(x) for each x ∈ A, and the current queue Q of candidates to be processed. These starting-position inputs are meant as an optimization mechanism and can be ignored on a first reading of the pseudocode.
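The Min-Cutoff guarantee described in the first bullet can be illustrated with a small Python sketch of the retention decision of a single program. This is a simplification under stated assumptions and not Algorithm 2 itself: the function name `retain_with_min_cutoff` is hypothetical, Min-Cutoff is treated as a rank value where smaller is better (with the default 0 protecting nobody), and the Cat-Change penalty is ignored.

```python
def retain_with_min_cutoff(applicants, rank, capacity, min_cutoff=0):
    """Split the applicants of one program into kept and rejected.

    applicants: candidates currently applying to or held by the program.
    rank:       candidate -> rank in the program's merit list (1 = best).
    capacity:   number of regular seats c(p).
    min_cutoff: candidates with rank <= min_cutoff are never rejected,
                even if that creates supernumerary seats (0 = nobody).
    """
    ordered = sorted(applicants, key=lambda x: rank[x])
    kept, rejected = [], []
    for position, x in enumerate(ordered):
        if position < capacity or rank[x] <= min_cutoff:
            kept.append(x)      # a regular seat, or protected by Min-Cutoff
        else:
            rejected.append(x)
    return kept, rejected


# Hypothetical program with two seats and a Min-Cutoff of 3 from round one.
rank = {"A": 1, "B": 2, "C": 3, "D": 4}
kept, rejected = retain_with_min_cutoff(["D", "C", "B", "A"], rank,
                                        capacity=2, min_cutoff=3)
plain_kept, plain_rejected = retain_with_min_cutoff(["D", "C", "B", "A"], rank,
                                                    capacity=2)
```

With the cutoff in place, C is retained on a supernumerary seat in addition to A and B; without it, the ordinary capacity rule applies and C is rejected along with D.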

3.3.2 Remarks

Although we have described Algorithm 2 with Cat-Change and Min-Cutoff in mind, the key takeaway is that the algorithm accommodates two kinds of cases in the multi-round scenario. In the first case, candidates get what may be termed the Min-Cutoff benefit: seats obtained in an earlier round are guaranteed, and so supernumerary seats may be created. In the second case, candidates may incur the Cat-Change penalty: although they may clear the Min-Cutoff, they are denied the benefit of creating supernumerary seats. The two types of cases can be mapped to a variety of scenarios; for example, Cat-Change may be replaced by credential-change candidates, i.e., candidates whose credentials were changed for some reason in moving from round i to round i + 1. See Section 5.5 for more details.

Algorithm 2 Deferred Acceptance (Full version allowing multiple rounds)

INPUTS:
    Candidates A, Programs P
    For each x ∈ A:
        Preference list Pref(x)
        Optional input: integer i(x). Default value i(x) = 1. (Start from beginning of preference list by default.)
    For each p ∈ P:
        Capacity c(p) and merit list Merit(p).
        Optional input: WaitList(p).
        Optional input: Min-Cutoff(p). 0 by default.
    Optional input: Q, a queue of candidates. By default contains all Indian candidates x with Length(Pref(x)) > 0.
    Optional input: Cat-Change. A list of candidates whose category has changed (empty by default)

OUTPUTS:
    For each candidate x ∈ A, the allocation μ(x) ∈ P ∪ {∅} and i(x).
    Also, for each program p ∈ P, the list of admitted candidates WaitList(p)

 1: ... Everything up to Line 11 in Algorithm 1
 2: while Q is non-empty do
 3:     x ← Dequeue(Q)                ▷ x is any candidate removed from queue Q
 4:     p ← p_{x,i(x)}                ▷ x applies to program p_{x,i(x)}
 5:     if x is not eligible for p OR c(p) = 0 then
 6:         Reject(x)
 7:         continue                  ▷ move to next person in Q
 8:     end if
 9:     y ← Last candidate in WaitList(p)                  ▷ y can be null
10:     w ← Last candidate in WaitList(p) ∩ Cat-Change     ▷ w =? y

51: for all x ∈ A do
52:     if i(x) ≤ Length(Pref(x)) then
53:         μ(x) ← p_{x,i(x)}
54:     else                          ▷ x reached the end of her list
55:         μ(x) ← ∅
56:     end if
57: end for
58: return μ(x), i(x) for all x ∈ A and WaitList(p) for all p ∈ P
59: function Reject(x)
60:     Increment i(x)
61:     if i(x) ≤ Length(Pref(x)) then    ▷ x wants to apply further
62:         Enqueue(x, Q)                 ▷ x enters queue Q again
63:     end if
64: end function
65: function RemoveAndReject(w, p)
66:     Remove w from WaitList(p)
67:     Reject(w)
68: end function
69: ▷ The next function is identical but is written for clarity
70: ▷ to indicate penalty for category change people.
71: ▷ This function will change when there are ties
72: function PenaltyRemoveAndReject(w, p)
73:     Remove w from WaitList(p)
74:     Reject(w)
75: end function
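The bookkeeping in Lines 51 to 64 of Algorithm 2 (the Reject function and the final read-out of μ from the positions i(x)) translates almost literally into Python. The sketch below covers only those lines; the names `reject` and `read_allocation` and the sample data are assumptions, positions are kept 1-based as in the pseudocode, and `None` stands for ∅.

```python
from collections import deque


def reject(x, i, prefs, queue):
    """Lines 59-64: advance x to her next choice; re-queue if one exists."""
    i[x] += 1
    if i[x] <= len(prefs[x]):  # x wants to apply further
        queue.append(x)        # x enters queue Q again


def read_allocation(i, prefs):
    """Lines 51-57: the program at position i(x) is x's allocation."""
    mu = {}
    for x in prefs:
        if i[x] <= len(prefs[x]):
            mu[x] = prefs[x][i[x] - 1]  # convert 1-based i(x) to a list index
        else:
            mu[x] = None                # x reached the end of her list
    return mu


# Hypothetical state: both candidates are rejected by their first choice.
prefs = {"A": ["IIT", "NIT"], "B": ["IIT"]}
i = {"A": 1, "B": 1}
queue = deque()
reject("A", i, prefs, queue)
reject("B", i, prefs, queue)
mu = read_allocation(i, prefs)
```

After the two rejections only A is back in the queue, because B has no further choices. Reading μ off the positions is meaningful only once the queue is empty, since only then does every position i(x) within a list point at a program whose wait list actually holds x.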

In subsequent sections, we show how various business rules of IITs and NITs (such as handling quotas, applicants with equal ranks, etc.) can be incorporated seamlessly in this algorithm.