STATISTICAL AND MATHEMATICAL ASPECTS IN UNDERTAKING RANKINGS: LESSONS DERIVING FROM POLISH PRACTICES
Author: Prof. dr. hab. Marek Rocki
The aim of the paper is to present the formal and methodological variety of rankings. To some extent it supports the thesis, obvious to many, that an objective ranking does not exist.
According to a popular, commonly used but informal definition, a ranking is compiled information about some objects, ordered according to a criterion or a set of criteria. Formal details will be discussed later in this presentation; for now it should be stated that the authors of rankings of educational institutions stress that the purpose of a ranking is to indicate the institution which is the best in terms of variously perceived quality. Reputation, power of attraction, assets, elite status, high quality and the best possible faculty members are the attributes describing the ranking winners.
The role of the authors of rankings is to collect, verify and store data (information) and, in further stages, to process and publicise them. Results of a ranking are of primary interest to university candidates and their parents as well as to employers and educational institutions. Rankings inform candidates how universities, their units or programmes are classified against the background of other higher schools with a similar profile of studies. Rankings might sometimes also give some hint of a candidate's chance of being admitted to a university.
Employers might use the result of a ranking as a basis for searching for potential employees. On the other hand, the ranking result informs university authorities how the strengths and weaknesses of their institution are perceived by others, and thus it could be an instrument for improving quality.
Similar purposes characterise all those who conduct evaluation, rating, accreditation and ranking of tertiary educational institutions. The most common and most clearly formulated role of evaluation, accreditation and ranking of educational institutions is to stimulate improvement in the quality of teaching. This results, not only in Poland, from the accessibility of education to huge numbers of people and the emergence of an educational market. In this respect public relations play an increasingly important role. Forthcoming demographic changes in Poland, together with prospective changes in legislation, mean that the image of an educational institution combined with its brand is becoming a key factor in its continued existence.
II. Rankings vs evaluation, certification, rating and accreditation.
From the perspective of a higher school, a ranking is a specific form of informing its recipients about the condition of individual players on the university education market against the background of the others, based on adopted rules and criteria. Yet rankings should be distinguished from evaluation, certification, rating and accreditation.
Evaluation is the simplest form of appraising a tertiary educational institution. According to its dictionary meaning, it aims at determining the quality of a university on the basis of a set of adopted criteria. In practice there are internal evaluations (made, e.g., on the basis of student surveys) and external ones. The significance of evaluations is shown, among other things, by the fact that the European University Association (EUA) has appointed a team whose task is to carry out evaluations at the European level, and by the fact that evaluation is often an initial stage of the accreditation process. The result of an evaluation could be either a report appraising the institution, not necessarily positioning it against the background of the others, or a certification, i.e. awarding a certificate to the institution, e.g. an ISO 9000 certificate. SWOT analysis, a tool for analysing and presenting the strategic position and competitive edge of an institution, is also a specific form of evaluation.
Rating, predominantly characteristic of the evaluation of financial institutions, entails classifying them into set groups. For instance, in Finland university chairs are appraised on a scale from 7 to 1 ("7" is granted to chairs classified among the best 10% in Europe). It is worth adding that the chairs are aggregated into faculties, and the aggregated grades are a basis for distributing the financial means allocated from the state budget for scientific research. A similar rating, on a scale of 1-5, was applied by the Polish Scientific Research Committee (Komitet Badań Naukowych). I would like to stress that, when allocating financial means, Poland is not the only country facing fundamental questions:
- Who should be a beneficiary: the best ones, or perhaps those who have shown the greatest positive change?
- Should financial means be earmarked for improving the situation of the weakest ones and, if so, how much?
Accreditation, following the definition in a dictionary of foreign words, is an authorisation to carry out an educational activity. The American Association of Collegiate Schools of Business (AACSB), established in 1916, is the oldest accreditation-granting institution; the leading North American universities were its founders. Between 1916 and 1999 the institution awarded accreditation to as few as 374 programmes in the fields of business administration and accounting. The European equivalent of (and competitor to) AACSB accreditation is the accreditation granted by EQUIS (European Quality Improvement System). This accreditation is promoted by EFMD, i.e. the European Foundation for Management Development. So far the EQUIS accreditation has been awarded to 73 institutions, of which 52 are from Europe and 9 from Asia and Africa.
Accreditation itself is perceived in different ways and takes various forms. Three forms are characteristic of Poland:
- The State Accreditation Commission (Państwowa Komisja Akredytacyjna, PKA), established under the amendment to the Act on University Education (in 2002). The fundamental role of this form of accreditation is to assess the quality of the teaching process and to provide opinions on formal applications for creating new fields of study within a given university, creating a new faculty and establishing a new university. All the roles of the State Accreditation Commission other than accreditation itself are referred to as licensing.
- University-environment commissions set up by public universities of various types, with the University Accreditation Commission (Uniwersytecka Komisja Akredytacyjna) being the oldest and most experienced one. It is worth mentioning in passing that the commissions and their formal and substantive ways of operating reflect a wide range of ideas concerning public tertiary education.
- Accreditation commissions founded by natural persons or associations, i.e. the FORUM Managerial Education Association Accreditation Commission (Komisja Akredytacyjna Stowarzyszenia Edukacji Menadżerskiej FORUM) and the Accreditation Commission of the Association of Rectors and Founders of Non-Public Universities (Komisja Akredytacyjna Stowarzyszenia Rektorów i Założycieli Uczelni Niepaństwowych).
Evaluations conducted by all commissions other than the State Accreditation Commission (PKA) are referred to as university-environment accreditation. The fundamental difference between the State Accreditation Commission (PKA) and the other commissions is that undergoing evaluation by university-environment commissions is optional and is initiated at the request of the interested university or its unit. A negative result of this kind of accreditation is usually not revealed to the public. The assumption in such a case is that failing to be awarded a quality certificate does not mean a negative result of the evaluation.
It is worth adding that accreditation may concern both an institution (as is the case with, e.g., EQUIS accreditation) and programmes of study (e.g. accreditation granted by the State Accreditation Commission).
III. Criteria and diagnostic features of rankings.
Classifications of rankings, as well as the impact of the methodology applied in making them, are the central interest of this paper. Rankings could be classified according to the criteria applied in them, the ways of measuring the criteria and the calculation method that leads to classifying the objects under analysis. By criteria I mean generally defined properties that should characterise the objects investigated. These properties might be of an aggregate nature, such as "conditions of studies", described by a few or several detailed characteristics. They could also be directly measurable properties, such as "professional success of graduates" (the "Newsweek" ranking), measured by the "percentage of graduates who have found a job in the companies examined".
Variables (in other words: characteristics) that represent the criteria are detailed features constituting the basis of the ranking process.
In general, the starting point in creating any kind of ranking is a set of data characterising N objects by means of S features, called diagnostic features. Let us make the general assumption that we analyse N objects (n = 1, ..., N), each of them described by S diagnostic features (s = 1, ..., S).
In other words, it could be said that we are analysing N variables, with S pieces of information gathered for each of them (a variable and an object might replace each other in analyses; a realisation of a variable might be replaced by the value of a feature in a given object).
Let us assume as follows:
let x_{sn} be the s-th realisation of the n-th variable (in other words: the value of diagnostic feature s in object n).
Example 1.
The ranking of N high schools based on admission to S universities (s - university number, n - high school number) in a given year. The following variants of evaluation might be assumed:
a) x_{sn} - the number of candidates from high school n admitted to university s;
b) x_{sn} - the number of candidates from high school n admitted to university s in relation to the number of graduates of the high school;
c) x_{sn} - the number of candidates from high school n admitted to university s in relation to the total number of candidates taking entry tests to the university;
d) x_{sn} - the number of candidates from high school n to university s in relation to the number of graduates of the high school.
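The variants above differ only in the denominator applied to the raw admission counts. A minimal sketch of variants a) and b) follows; the figures and the names `admitted`, `graduates`, `variant_a` and `variant_b` are hypothetical and serve only as illustration.

```python
# Hypothetical data: admitted[s][n] is the number of candidates from
# high school n admitted to university s (S = 2 universities, N = 3 schools).
admitted = [
    [10, 4, 0],  # university s = 1
    [5, 6, 0],   # university s = 2
]
# Assumed number of graduates of each high school in the given year.
graduates = [100, 50, 40]


def variant_a(admitted):
    """Variant a): x_sn is the raw number of admitted candidates."""
    return [row[:] for row in admitted]


def variant_b(admitted, graduates):
    """Variant b): admissions in relation to the graduates of high school n."""
    return [[x / g for x, g in zip(row, graduates)] for row in admitted]


x_a = variant_a(admitted)
x_b = variant_b(admitted, graduates)
print(x_b)
```

Variant a) favours large high schools, while variant b) removes the effect of school size, which is one reason why the choice of variant already influences the final ordering.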
Example 2.
The ranking of N universities characterised by S indicators (s - quality feature, n - university). In this particular case x_{sn} is the value of indicator s for university n.
Both qualitative and quantitative features could be subject to investigation. Quantitative features are those whose value can be measured and presented in numbers. They could be absolute values, e.g. the number of Ph.D. faculty members employed full-time or the number of volumes in the university library, or relative values, for instance the number of candidates in relation to the number admitted in the recruitment process or the number of foreign students in proportion to the overall number of students.
Qualitative features, on the other hand, are those that state a fact, and their values are of a logical nature: yes or no (more generally, whether something exists or not, whether the object has it or not, e.g. holding or not holding an accreditation, running or not running a career bureau, etc.). Strictly defined, a qualitative feature is a non-measurable property, a given variant of which either is or is not present in a given unit of the population; qualitative features could be dichotomous (two-valued) or polytomous (multi-valued). Sometimes it is possible to come across ordinal (quasi-qualitative) features, i.e. ones that express (similarly to qualitative features) the magnitude of the investigated feature in the object under examination. "Contacts with the local community", evaluated on the scale weak, average, strong, could serve as an example.
It is also worth differentiating between variables of a stream (flow) character and those of a resource (stock) nature. An example of the former is the number of professorial titles awarded to faculty members within a year; an example of the latter is the total number of professors employed by the university on a given day (most often 31 December).
The data used in making rankings concern, as a rule, the current situation of the university. However, in some cases the authors of a ranking take into account the projected (expected) future of the university under examination.
For instance, forecasts of job opportunities for the university's graduates in the companies investigated might be an essential element of a ranking. Such a criterion is applied, among others, in the rankings of "The Wall Street Journal" (the survey question concerns the next two years), "Perspektywy" (Polish weekly) and "Rzeczpospolita" (Polish daily) (in this case the time horizon is not clearly determined).
Among the examined features, stimulants, destimulants and nominants could be differentiated. Stimulants are features whose high value is linked to a positive evaluation of the object examined. An example is the share of independent faculty members in the total research and teaching staff of the university. Destimulants, on the other hand, are features whose low value results in a positive evaluation. An example of a destimulant is the ratio of the number of students to the number of faculty members. A nominant is a feature for which an object is evaluated the more positively the closer its value is to a defined value. When a feature is used whose type is hard to determine (e.g. the "Polityka" ranking asked universities in its survey about the number of students expelled), methods that formally differentiate a stimulant from a destimulant could be applied (appropriate procedures are discussed by Grabiński). The methods discussed in the literature all transform observations in a way that makes objects comparable (e.g. multiplying the value of a destimulant by -1 or replacing it with its reciprocal).
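The conversion of a destimulant into a stimulant can be sketched as below. The sketch reads the transformations mentioned above as multiplication by -1 and taking the reciprocal, which is an interpretation; the function name `to_stimulant`, the `kind` labels and the sample ratios are hypothetical.

```python
def to_stimulant(values, kind):
    """Orient a feature so that a higher value means a better evaluation.

    kind is one of (labels are illustrative assumptions):
      "stimulant"              - values are returned unchanged,
      "destimulant_negate"     - values are multiplied by -1,
      "destimulant_reciprocal" - values are replaced by 1 / value.
    """
    if kind == "stimulant":
        return list(values)
    if kind == "destimulant_negate":
        return [-v for v in values]
    if kind == "destimulant_reciprocal":
        return [1.0 / v for v in values]  # requires non-zero values
    raise ValueError("unknown feature kind: " + kind)


# Hypothetical students-per-faculty-member ratios for three universities.
students_per_teacher = [20.0, 10.0, 40.0]

negated = to_stimulant(students_per_teacher, "destimulant_negate")
reciprocal = to_stimulant(students_per_teacher, "destimulant_reciprocal")

# Both conversions point to the same best object (the lowest original ratio).
best_by_negation = max(range(3), key=lambda i: negated[i])
best_by_reciprocal = max(range(3), key=lambda i: reciprocal[i])
print(best_by_negation, best_by_reciprocal)
```

Both conversions preserve the ordering of objects on the single feature, but they change the distances between objects differently, so they are not interchangeable once features are aggregated.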
Sources of information, and consequently the data that serve as the basis for attributing values to features and thus to the criteria, fall into two categories with respect to the investigated institutions: exogenous and endogenous. For example, the data used in the ranking of MBA programmes published by "The Wall Street Journal" are exogenous. They were collected from over 200 thousand employment agencies (more than 3,500 programmes (business schools) were evaluated) and, according to the authors of the ranking, universities had no influence on the choice of the employment agencies involved in the evaluation. On the other hand, data obtained from surveys carried out at universities are endogenous. Such surveys are commonly used in Polish rankings ("Perspektywy", "Polityka", "Wprost" weeklies), yet their significance is perceived differently. In "Polityka", in the authors' opinion, it is the surveys that provide the major source of data. The data obtained from the surveys are subject to further verification (rankings are not amended with any other information) on the basis of publications of the Ministry of Education and Sports and the Ministry of Science and Information Technology (MENiS, MNiI).
In the rankings of "Perspektywy" weekly and "Rzeczpospolita" daily the main data are exogenous (opinions of both employers and faculty members); some additional data come from publications of the Ministry of Education and Sports, the Ministry of Science and Information Technology and the Central Statistical Office. The endogenous data obtained from surveys conducted at universities are treated as a kind of supplement. Some authors of rankings are sceptical about endogenous data, particularly when unverified, and therefore minimise their influence upon the findings. That is the reason why in a German ranking (see: Federkeil) part of the data is categorised as "hard facts" and the other part as "subjective opinions".
A specific kind of data comes from surveys: "Rzeczpospolita" daily conducted them in 2000, whereas "Polityka" weekly treated such surveys (conducted in 2000-2001) as a supplement to the ranking. Such data are, however, used regularly by Asahi Shimbun (see: Yonezawa), where the surveys are carried out every two or three years.
"US News and World Report" evaluates students' satisfaction with their studies in a unique way: what is taken into consideration is the percentage of students who made donations to their universities as well as the percentage of the newly admitted who graduated within a set number of years.
In making rankings, calculations are not based on the directly observed values of statistical data, since such data cannot be directly compared and aggregated. Given the above, in the process of making a ranking it is necessary to:
- define the way of unifying all the variables so that they are comparable;
- define the analytical form of an aggregating function.
A way of unifying the data could be, e.g.:
- ranking, the idea of which is to replace the values of diagnostic features with ranks resulting from the position of a given object in preliminary single-criterion rankings (see: Rocki, "Ranking uczelni wyższych", 2001),
- quotient transformation, in which the data are expressed in proportion to an average, minimum or maximum value. Such a method is applied in the university rankings made by "Perspektywy" and "Rzeczpospolita" as well as "Polityka". It is also included in the Directive of the Russian Federation Minister of Education on ranking of 26 February 2001 (see: Filinov). The method consists in:
* taking as the unit the feature value of the university with the maximum (non-zero), i.e. best, value of the feature, and expressing the values of the remaining universities in relation to it, or
* taking as the unit the feature value of the university with the minimum (non-zero), i.e. best, value of the feature, and expressing the values of the remaining universities in relation to it,
- standardisation, the idea of which is to transform the observed values so that their average equals zero and their standard deviation equals one,
- unitarisation, the idea of which is to transform the primary values of variables into relative ones while retaining a constant, unit range of variation; such a transformation is made in the ranking of Russian Federation universities published by "Career Journal" (see: Filinov),
- isotonic transformation (see: Pluta), in which the primary data are transformed in the following way:

$$ r_{sn} = \frac{x_{sn}}{\sum_{k=1}^{S} x_{kn}} $$

This means that the r_{sn} value is the share of feature s in the total of all features of object n. As a result of such a transformation, the length of the observation vector of each object (object = variable) is equal to one, in the sense that the transformed values of a given object sum to unity.
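The single-feature unification methods listed above can be sketched as follows. This is an illustration only: the function names are hypothetical, ties in ranking share the better rank by assumption, the population standard deviation is assumed in standardisation, and unitarisation is assumed to take the common form (x - min) / (max - min).

```python
from statistics import mean, pstdev


def ranks(values):
    """Replace values with ranks (1 = highest value); ties share the better rank."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def quotient_by_max(values):
    """Express each value in relation to the maximum (assumed non-zero)."""
    top = max(values)
    return [v / top for v in values]


def standardise(values):
    """Shift and scale so that the mean is 0 and the standard deviation is 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def unitarise(values):
    """Rescale to a unit range of variation: minimum -> 0, maximum -> 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical values of one stimulant feature for three universities.
values = [2.0, 4.0, 6.0]
print(ranks(values))
print(quotient_by_max(values))
print(standardise(values))
print(unitarise(values))
```

Ranking keeps only the order of objects, whereas the other three transformations also keep information about distances between them, each on a different scale.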
Example 1 (continued).
The ranking of N high schools based on recruitment to S universities (s - number of the university, n - number of the high school) in a given year. As can easily be seen, the isotonic transformation yields r_{sn} values for each variant of evaluating high schools within the recruitment procedure (Example 1). The values are obtained by relating the admissions to the number of graduates of high school n admitted to any of the studies.
In this case the sum of the transformed observations is equal to unity because the transformed values are the shares of variable s (i.e. the candidates admitted to university s) in object n (among the admitted candidates from high school n). In other words, the observation vector of a given high school n should be interpreted as the structure of its admissions across all universities, regardless of the number of candidates from this high school.
Obviously, if no leaver of a given high school has been admitted to any of the universities surveyed, that high school is excluded from the classification.
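For Example 1 the isotonic transformation and the exclusion rule can be sketched as below. The admission counts and the function name `admission_structure` are hypothetical; the sketch assumes that each count is divided by the total number of admissions from the same high school.

```python
def admission_structure(admitted):
    """Isotonic transformation of admission counts.

    admitted[s][n] is the number of leavers of high school n admitted to
    university s. Returns {n: [r_1n, ..., r_Sn]}, where the shares of each
    high school sum to one. High schools with no admissions are omitted,
    because their shares are undefined (division by zero).
    """
    n_schools = len(admitted[0])
    structure = {}
    for n in range(n_schools):
        column = [row[n] for row in admitted]
        total = sum(column)
        if total == 0:
            continue  # no leaver admitted anywhere: school is out of the ranking
        structure[n] = [x / total for x in column]
    return structure


# Hypothetical counts: 3 universities (rows) and 3 high schools (columns).
admitted = [
    [10, 4, 0],
    [5, 6, 0],
    [5, 10, 0],
]
result = admission_structure(admitted)
print(result)
```

High schools 0 and 1 both have 20 admissions in total, yet their structures differ, which is exactly the information the transformed values retain.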
Example 2 (continued).
In the case of a ranking of N universities based on S features (s - number of a quality feature, n - university), x_{sn} is the value of indicator s for university n; therefore (considering the isotonic transformation) r_{sn} is the share of the value of feature s (quality indicator s) within the characteristics of university n. Depending on the kind (definition) of the diagnostic features, the transformed values may, but do not have to, have a reasonable interpretation. In Example 1 the shares of a high school's leavers admitted to the separate universities sum to one, so the isotonically transformed values add up to unity. In Example 2 the sum does not have an obvious interpretation.
It is worth adding that in the isotonic transformation similar objects are those of similar structure. Referring to Example 1: the similarity of high schools is derived from an identical structure of admissions to the separate universities.
Research targeted at obtaining subsets of data consisting of elements similar in terms of the level of the values of the variables is called isotonic research. The vector whose components are the sums of r_{sn} over n
is called an isotonic meter (see: Pluta). This meter provides information about only one component of the value of the variables, namely the component reflecting the "rank" typical of a given element in the set of objects under review. In such a case, the classification of objects according to non-decreasing values of the meter is the result of the ranking.
IV. Aggregation of criteria and features.
The methodology of aggregating variables in order to classify the objects (institutions) under examination is a separate problem. In my opinion, a ranking in the full meaning of the word is one in which objects are classified on the basis of an analysis of many variables (features) making up the criteria that differentiate the properties of the evaluated (compared) objects. The necessity of defining a synthetic measure enabling the classification of objects seems to be the basic formal problem; it results from the very nature of a multi-criteria ranking and from the existence of numerous features of the objects examined. This synthetic measure is to be a measurable representative of a directly unobservable metafeature (metacriterion) that is the basis for searching for the "best possible object". In a multi-criteria ranking there emerges the problem of aggregating information on many features in order to establish one synthetic measure. It will enable the classification of objects (from the formal point of view: maximisation of the weighted average of indicators from a multi-criteria evaluation).
The method most frequently applied in making rankings ("Polityka", "Perspektywy", "Rzeczpospolita", "Wprost") is based on the simple procedure of adding up the converted values of features by means of an a priori (subjectively) assumed system of weights. It is therefore crucial for the values to be comparable (i.e. ones that can be added up). An alternative to the method of adding up the weighted values of standardised features could be a method that orders the objects without the necessity of constructing a synthetic feature (see: Nykowski). It is also possible to create a synthetic feature (an aggregated diagnostic variable) using the method of principal components or the methodology of soft modelling (see: Jöreskog, Wold, Rogowski and the ranking of Warsaw high schools by "Perspektywy", each successive year in 1997-2004).
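The weighted-sum procedure described above can be sketched as follows. The feature values are assumed to be already unified and oriented as stimulants; the object names, values, weights and the function names `weighted_score` and `rank_objects` are hypothetical and do not reproduce any of the cited rankings.

```python
def weighted_score(features, weights):
    """Synthetic measure: weighted sum of unified feature values."""
    return sum(w * f for w, f in zip(weights, features))


def rank_objects(table, weights):
    """Order objects from best to worst by their weighted score.

    table maps an object name to its list of unified feature values.
    Returns the ordered names and the dictionary of scores.
    """
    scores = {name: weighted_score(vals, weights) for name, vals in table.items()}
    order = sorted(scores, key=lambda name: scores[name], reverse=True)
    return order, scores


# Hypothetical unified values of two features for three universities.
table = {
    "A": [1.0, 0.2],
    "B": [0.6, 0.9],
    "C": [0.3, 0.5],
}
weights = [0.5, 0.5]  # a priori (subjective) weights summing to one

order, scores = rank_objects(table, weights)
print(order)   # ['B', 'A', 'C']
print(scores)
```

The weights are the only place where the authors' preferences enter this procedure, which is why their choice is usually the most contested element of a ranking.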
The fundamental classification of rankings distinguishes, firstly, actual single-criterion rankings; secondly, pseudo-single-criterion rankings, in which a multi-criteria evaluation has been replaced by an ordinal metacriterion considering single elements of the multi-criteria evaluation; and thirdly, actual multi-criteria rankings. From the formal point of view, a single-criterion ranking is the result of sorting a list of objects according to an assigned criterion. The so-called lexicographic ("dictionary-like") alphabetical order is the basis for such a ranking. Examples of single-criterion rankings are: the ranking of majors of studies made by "Eurostudent" monthly (November 1999), the ranking of French high schools by "L'Evenement du Jeudi" weekly (e.g. 27 February 1997), the ranking of universities by "Newsweek" (among others issue no. 27, 2004) and the ranking of financial institutions by "Gazeta Bankowa" (e.g. 24 May 2004). In these rankings (very interesting ones for the readers) the following factors were considered:
- the number of candidates in proportion to the admission limit in a selected year ("Eurostudent");
- the difference between the projected (on the basis of regulations issued by educational authorities) and the true number of successfully passed high school final exams ("L'Evenement du Jeudi");
- the number of graduates of the university under review employed in managerial posts in proportion to the total number of its graduates in the period covered by the ranking ("Newsweek" of 6 April 2003, 21 March 2004). This ranking is seemingly a multi-criteria one, since it adds up three numbers with subjectively defined weights, but de facto they measure one feature: "how many graduates have got a well-paid job" (see: "Newsweek", Tomasz Wróblewski, 2002);
- assets ("Gazeta Bankowa", ranking of the 200 largest financial firms), although, interestingly, other additional features are analysed in the ranking as well.
In the case of Polish rankings in which a quality metacriterion is used, some other criteria could be differentiated.
They are as follows:
- conditions of studying ("Perspektywy"), student as a ranking target ("Polityka"), social conditions of studying ("Wprost");
- prestige and potential of faculty members ("Perspektywy"), position among other universities ("Polityka"), intellectual background ("Wprost").
On the other hand, among diagnostic features the following exogenous features are mentioned by the authors:
- category of the Committee of Scientific Research;
- number of honours titles and honours degrees awarded, or the number of Ph.D. and habilitation procedures conducted;
- number of points scored in surveys filled in by employers and faculty members;
- holding an accreditation;
- share of M.A. students in the overall number of full-time students;
- university status license;
- number of candidates in proportion to the admission limit;
- number of full-time students in proportion to the overall number of students;
- accommodation capacity of dormitories in proportion to the number of students;
- expectations concerning job opportunities of graduates of a set university.
Endogenous features are as follows:
- number of grants awarded;
- number of publications;
- library resources measured by the number of volumes and of magazines subscribed to;
- number of scientific associations and student organizations;
- activity of career bureau;
- extent of library computer networking.
The above-mentioned diagnostic features are used in practice (obviously with different weights) as measures of various criteria. For instance, the KBN category in the "Polityka" ranking measures "university position" with a weight lower than 25% (the 25% is attributed to a total of 26 indicators; the authors do not reveal details), while in "Perspektywy" it is a measure of "intellectual potential" and has a weight of 10% in the final ranking.
The variety of the elements discussed above that make up the methodology of a ranking, i.e. the criteria, the features allowing their measurement, the methods of aggregation (with a focus on weights) and finally the ways of classification, clearly proves that an objective ranking cannot exist. Even if all the postulates of various professional groups regarding the criteria are taken into account, the final result depends on the weight system adopted as well as on the method of data standardisation. That is why rankings trigger heated debates each year. Public discussions or in-house editorial debates are held (popularised by "Polityka") and seminars are organised (see: publications of the Institute of Contemporary Civilisation Problems). Sometimes the results of a ranking make publicly evaluated institutions boycott it (the boycott declared by some of the high schools evaluated in the Warsaw high schools ranking compelled its authors to abandon the use of endogenous data).
There is no clear and unanimous answer to the question whether rankings only influence the behaviour patterns of those at whom they are targeted (see: Eccles). This might result from the fact that rankings are relatively "young" against the background of the centuries-long university tradition. However, the emergence of rankings results from profound, qualitative changes in systems of education and therefore their further development could be anticipated.
The following statement seems to be the best conclusion: each ranking is objective in the subjective opinion of its authors, and that is why the authors, in order to preserve their credibility, have to be able to prove the trustworthiness, impartiality and transparency of the methodology applied.
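The dependence of the final result on the weights and on the unification method can be illustrated with a small sketch. All data, weights and function names below are hypothetical; both features are assumed to be stimulants, and unitarisation is assumed to take the form (x - min) / (max - min).

```python
def by_max(column):
    """Quotient transformation: each value in relation to the maximum."""
    top = max(column)
    return [v / top for v in column]


def unitarise(column):
    """Unitarisation: minimum -> 0, maximum -> 1."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]


def ranking(names, columns, weights, unify):
    """Unify each feature column, add up the weighted values, sort best first."""
    unified = [unify(col) for col in columns]
    scores = [
        sum(w * col[i] for w, col in zip(weights, unified))
        for i in range(len(names))
    ]
    order = sorted(range(len(names)), key=lambda i: scores[i], reverse=True)
    return [names[i] for i in order]


names = ["A", "B", "C"]
# Hypothetical raw values of two stimulant features for the three objects.
columns = [
    [100.0, 90.0, 80.0],
    [15.0, 10.0, 20.0],
]

print(ranking(names, columns, [0.5, 0.5], by_max))     # ['C', 'A', 'B']
print(ranking(names, columns, [0.5, 0.5], unitarise))  # ['A', 'C', 'B']
print(ranking(names, columns, [0.8, 0.2], by_max))     # ['A', 'C', 'B']
```

With identical raw data the winner changes when either the unification method or the weights change, although no single choice here is evidently more correct than the others.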
1) Eccles Charles, "The Use of University Rankings in the United Kingdom", in: "Ranking and league tables of higher education institutions", UNESCO-CEPES, European Centre for Higher Education, vol. XXVII, no. 4, 2002
2) "Edukacja Ekonomiczna", collective work edited by M. Rocki, VII Kongres Ekonomistów Polskich, 2001
3) Grabiński Tadeusz, "Wielowymiarowa analiza porównawcza w badaniach dynamiki zjawisk ekonomicznych", Zeszyty Naukowe AE w Krakowie, no. 61, 1984
4) Jöreskog K.G., Wold H., "Systems under indirect observations. Causality-Structure-Prediction", North-Holland Publishing Company, Amsterdam, 1982
5) Federkeil Gero, "Some Aspects of Ranking Methodology - The CHE Ranking of German Universities", in: "Ranking and league tables of higher education institutions", UNESCO-CEPES, European Centre for Higher Education, vol. XXVII, no. 4, 2002
6) Filinov N.B. and Ruchkina S., "Ranking of Higher Education Institutions in Russia", in: "Ranking and league tables of higher education institutions", UNESCO-CEPES, European Centre for Higher Education, vol. XXVII, no. 4, 2002
7) Instytut Problemów Współczesnej Cywilizacji, "Autorytet uczelni", 2002
8) Instytut Problemów Współczesnej Cywilizacji, "Jakość kształcenia i akredytacja w szkolnictwie wyższym w Polsce", 2002
9) Nykowski Ireneusz, "O rankingach skończonego zbioru obiektów ocenianych wielokryterialnie", Akademia Ekonomiczna w Krakowie, Rector's Lectures, no. 49, 2001
10) Pluta Wiesław, "Wielowymiarowa analiza porównawcza w modelowaniu ekonometrycznym", Biblioteka Ekonometryczna, PWN, 1986
11) Pociecha Józef, "Metody statystyczne w badaniach marketingowych", PWN, 1996
12) Rocki Marek, "Ranking liceów w rekrutacji do SGH", Informator dla kandydatów na studia, SGH, 2001
13) Rogowski Józef, "Kilka uwag o 'miękkim' modelowaniu ekonometrycznym", Przegląd Statystyczny, no. 4, 1986
14) Wróblewski Tomasz, editorial introducing the university ranking, Newsweek, 31 March 2002
15) Yonezawa Akiyoshi, Nakatsui Izumi, Kobayashi Tetsuo, "University Rankings in Japan", in: "Ranking and league tables of higher education institutions", UNESCO-CEPES, European Centre for Higher Education, vol. XXVII, no. 4, 2002