WO2006117498A2 - Dynamic assessment system - Google Patents

Dynamic assessment system

Info

Publication number
WO2006117498A2
Authority
WO
WIPO (PCT)
Prior art keywords
user, database, computer means, data, algorithms
Application number
PCT/GB2005/001667
Other languages
French (fr)
Inventor
Steven Gordon William Cushing
Michael Peppiatt
Original Assignee
Steven Gordon William Cushing
Michael Peppiatt
Application filed by Steven Gordon William Cushing and Michael Peppiatt
Priority to PCT/GB2005/001667
Publication of WO2006117498A2


Classifications

    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B - EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B 7/00 - Electrically-operated teaching apparatus or devices working with questions and answers
    • G09B 7/06 - Electrically-operated teaching apparatus or devices working with questions and answers of the multiple-choice answer-type, i.e. where a given question is provided with a series of answers and a choice has to be made from the answers
    • G09B 7/08 - Electrically-operated teaching apparatus or devices working with questions and answers of the multiple-choice answer-type, i.e. where a given question is provided with a series of answers and a choice has to be made from the answers, characterised by modifying the teaching programme in response to a wrong answer, e.g. repeating the question, supplying further information

Definitions

  • This invention relates to a method of assessment.
  • Assessing participants in a number of different environments is, of course, well known. On a simple level, a school pupil is tested throughout his education to assess his knowledge and understanding. At a more complex level, a job applicant may be tested to assess the sort of job he is suited to. Within the work place, team building exercises may comprise testing of employees to assess their strengths and weaknesses and how they relate to others in that workplace.
  • Static assessments comprise a number of questions or tasks through which the user works, an output being given at the end of the question or task set.
  • In static assessments there is no room for modification according to the user, e.g. a high-level user works through the same question or task set as a low-level user.
  • There is only one pathway through the assessment, from the first question or task to the last question or task.
  • Adaptive assessments comprise a number of predetermined pathways through the assessments.
  • a particular pathway may be selected at a decision or trigger point. For example, a yes/no question may be asked; if the answer is yes, the test proceeds to next question A; if the answer is no, the test proceeds to next question B.
  • This is a trigger point selecting one of two pathways. There may be several trigger points in the test, each selecting one pathway from a selection of predetermined pathways.
  • an adaptive assessment takes no account of user data to select a pathway, so the results are naturally less accurate. For example, if the user were an expert mathematician, it would only take a small number of high level, detailed mathematics questions to provide fast and highly accurate results. An adaptive assessment would not take account of this and the user would be required to take all levels of the assessment via one of a number of predetermined pathways, resulting in slow and not very accurate results.
  • a method of assessing a user comprising the steps of: a. computer means processing existing user data to determine a test starting point; b. computer means selecting at least one question and/or task from a database of questions and/or tasks; c. computer means putting the selected question(s) and/or task(s) to the user; d. computer means receiving a response from the user; e. computer means processing the user's response to generate a user result; f. repeating steps b, c, d and e until the computer means can generate a user outcome, the question(s) and/or task(s) selected at each selecting step being dependent on the previous user result or results; and g. computer means reporting the user outcome.
  • the question or task of step b will of course be chosen in accordance with the test starting point determined in step a.
  • the test starting point may for example be a question or task or a group of questions or tasks from which a question or task is selected.
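  • By way of illustration only, the following minimal Python sketch shows one possible reading of steps a to g above; the function names, data shapes and thresholds (for example the bank of questions and the three-correct-in-a-row stopping rule) are assumptions and are not taken from the specification.

        # Illustrative sketch of the dynamic assessment loop (steps a to g).
        # All names, data shapes and thresholds are assumptions.
        import random

        QUESTION_BANK = {                       # level -> list of (question, correct answer)
            1: [("2 + 2 = ?", "4")],
            2: [("12 x 12 = ?", "144")],
            3: [("Smaller root of x**2 - 5x + 6 = 0 ?", "2")],
        }

        def determine_starting_point(existing_user_data):
            # Step a: use whatever relevant existing data there is, else start at level 1.
            return existing_user_data.get("estimated_level", 1)

        def assess_user(existing_user_data, answer_fn, max_items=10):
            level = determine_starting_point(existing_user_data)          # step a
            results = []
            for _ in range(max_items):
                question, correct = random.choice(QUESTION_BANK[level])   # step b
                response = answer_fn(question)                            # steps c and d
                result = {"level": level, "correct": response == correct} # step e
                results.append(result)
                # Step f: the next selection depends on the previous result(s).
                if result["correct"] and level < max(QUESTION_BANK):
                    level += 1
                elif not result["correct"] and level > 1:
                    level -= 1
                if len(results) >= 3 and all(r["correct"] for r in results[-3:]):
                    break                                                 # enough evidence gathered
            return {"items_used": len(results), "final_level": level}     # step g (report)

        print(assess_user({"estimated_level": 2}, lambda q: "144"))
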
  • a dynamic test is distinct from an adaptive test because an adaptive test consists of a series of predetermined pathways with decision trigger points.
  • An adaptive test may appear to be dynamic but is in fact a series of distinct and discrete fixed-path tests; an adaptive test is, in that sense, static rather than dynamic.
  • a dynamic test may use information from its embedded marking rules, sufficiency thresholds, trigger point decisions and the curriculum (any of which in themselves may be dynamic or static) .
  • Compound multi-dimensional complex "marking" arrays may be used, integrating one to many, many to one and many to many relationships (where the term "marking" is used in context of logic structure decisions and/or sufficiency thresholds and/or marking rules across levels of progression through to the curriculum and/or cross curricula) .
  • the starting point of the assessment is determined by processing existing user data (where such data exists) and the route that the user takes through the test depends on user responses (which may be the responses of the user or may comprise the responses of a plurality of users).
  • the question(s) and/or task(s) selected at each selecting step is dependent on the previous user result or results: it may be dependent on the previous response or responses of the user; it may be dependent on a cluster of previous responses or a sufficiency rule (for example, it may decide that sufficient evidence has been gathered in respect of a first subject, so that the next selection moves on to a second subject).
  • User responses may be used to modify pre-conceived rules or may be used to determine rules .
  • the system may learn from responses given (which is especially relevant in a process- driven assessment, where alternative methods may be deployed to arrive at a desired outcome) .
  • Repeating steps b, c, d and e whilst selecting appropriate questions or tasks based on previous user result(s) means the pathway selected through the assessment is tailor-made to a particular user.
  • the assessment is time efficient and is able to provide a useful outcome much more quickly than, say, an adaptive assessment, which simply selects the pathway through the assessment from a set of predetermined pathways.
  • the assessment is also more accurate as the data gathered is appropriate to the particular user.
  • the assessment can provide highly accurate results in a short time.
  • the existing user data which is processed is that which is appropriate to the particular assessment. Thus, in many cases not all existing user data will be processed; only that which is relevant to the assessment will be selected. Thus, the assessment has improved accuracy (as the existing user data can be used to go straight to relevant questions or tasks for a given user) and is time efficient (as the existing user data may prevent useless questions or tasks being asked or set at the outset of the assessment and/or throughout the assessment).
  • the method may further comprise computer means processing the existing user data as well as the user's response to generate the user result.
  • the existing user data may be from one user (the present user) or from a plurality of users.
  • the computer means comprises host computer means and user computer means.
  • the host computer means may perform one or all of steps a, b, c, d, e and g.
  • the user computer means may perform one or all of steps a, b, c, d, e and g.
  • the user outcome generated by the computer means may take a variety of forms.
  • the computer means may be able to generate a highly detailed user outcome.
  • the computer means may be unable to generate a detailed user outcome because the user responses are inadequate. (In that case, it may be appropriate for the user outcome to simply be an indication that the results are inadequate.)
  • the computer means may be programmed to generate a user outcome after a given time of testing, in which case the detail and usefulness of the user outcome will depend on the user. The computer means uses the user results to determine an appropriate time and form of user outcome.
  • the user outcome may be a recommendation for the user to undertake a static or adaptive test. More preferably, the user outcome itself may be in the form of a static or adaptive test.
  • the computer means may put the selected question or task to the user either directly or indirectly. If the computer means puts the selected question or task indirectly to the user, this may be via another user or via a further computer means. Similarly, the computer means may receive the user's response either directly or indirectly. If the computer means receives the user's response indirectly, this may be via another user or via a further computer means.
  • the existing user data may comprise historical data.
  • the existing user data may comprise existing ability data.
  • the existing user data may comprise attainment data.
  • the existing user data comprises a user outcome or outcomes from previous assessments.
  • the assessment improves even more in accuracy by improving the selection of test starting point as well as the selection of questions and/or tasks.
  • step a further comprises computer means processing existing user data to determine a test delivery method.
  • the computer means determines the test starting point via at least one algorithm from a database of algorithms. In an embodiment of the invention, the computer means selects the question(s) and/or task(s) from the database of questions and/or tasks via at least one algorithm from a database of algorithms.
  • the word "algorithm” means any suitable algorithm or rule .
  • the computer means may use either evidence or inference to determine the test starting point.
  • the computer means may use either evidence or inference to select a question (s) and/or task(s) from the database of questions and/or tasks.
  • the computer means comprises an inferential engine, which may determine the test starting point and/or select questions or tasks.
  • the aforementioned databases of algorithms may comprise at least some algorithms whose steps are pre-determined and/or at least some algorithms whose steps are dependent on the existing user data and/or the user result or results and/or at least some steps which are pre-determined and at least some steps which are dependent on the existing user data and/or the user result or results .
  • the algorithms in the databases may originate from previous assessments, including both static and dynamic assessments.
  • the aforementioned databases of algorithms may be stored on the computer means .
  • the databases of algorithms may be stored on a record carrier for use with the computer means, for example a floppy disc or CD-Rom.
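  • A minimal sketch, assuming hypothetical names, of how a database of starting-point algorithms might look: some algorithms have wholly pre-determined steps, others have steps that depend on the existing user data, and the choice between them can itself depend on what data exists.

        # Illustrative only: a "database" of starting-point algorithms.
        def static_entry(user_data):
            return 1                                   # steps wholly pre-determined

        def attainment_based_entry(user_data):
            # Steps dependent on existing user data (e.g. a previous grade).
            grade = user_data.get("previous_grade", "E")
            return {"A": 3, "B": 3, "C": 2}.get(grade, 1)

        ALGORITHM_DATABASE = {
            "static_default": static_entry,
            "from_attainment": attainment_based_entry,
        }

        def determine_test_starting_point(user_data):
            # The selection of the algorithm can itself depend on the data available.
            key = "from_attainment" if "previous_grade" in user_data else "static_default"
            return ALGORITHM_DATABASE[key](user_data)

        print(determine_test_starting_point({"previous_grade": "B"}))   # -> 3
        print(determine_test_starting_point({}))                        # -> 1
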
  • the user result is generated via at least one marking algorithm from a database of marking algorithms. In an embodiment of the invention, the user outcome is generated via at least one marking algorithm from a database of marking algorithms .
  • marking does not necessarily imply a traditional marking process in which the result or outcome is a simple mark or grade.
  • the result or outcome may be a suggestion for future assessment pathways, a recommendation for user assistance or similar.
  • the aforementioned databases of marking algorithms may comprise at least some marking algorithms whose steps are pre-determined and/or at least some marking algorithms whose steps are dependent on the existing user data and/or the user's response and/or the user result or results and/or at least some marking algorithms having at least some steps which are pre-determined and at least some steps which are dependent on the existing user data and/or the user's response and/or the user result or results .
  • Those marking algorithms whose steps are entirely predetermined are static marking algorithms.
  • Those marking algorithms whose steps are dependent on the existing user data and/or the user result or results are either partially or wholly dynamic. However, because the selection of marking algorithms (whether those marking algorithms are static or dynamic) is preferably not predetermined, but will instead depend on the user, even a set of static marking algorithms can form a dynamic set.
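  • The following sketch, with invented names and an invented selection rule, illustrates how marking algorithms that are individually static can still form a dynamic set, because which one is applied depends on the user's pathway so far.

        # Illustrative only: two static marking algorithms whose selection is dynamic.
        def mark_exact(response, expected):
            return 1.0 if str(response).strip() == str(expected) else 0.0

        def mark_numeric_tolerance(response, expected, tol=0.05):
            try:
                return 1.0 if abs(float(response) - float(expected)) <= tol else 0.0
            except ValueError:
                return 0.0

        MARKING_DATABASE = {"exact": mark_exact, "tolerant": mark_numeric_tolerance}

        def mark(item, response, previous_results):
            # Selection is not pre-determined: a user who has struggled recently
            # on a numeric item is marked with the tolerance algorithm instead.
            recent = previous_results[-3:]
            struggling = bool(recent) and sum(recent) / len(recent) < 0.5
            key = "tolerant" if struggling and item["type"] == "numeric" else "exact"
            return MARKING_DATABASE[key](response, item["expected"])

        item = {"type": "numeric", "expected": "3.14"}
        print(mark(item, "3.1416", previous_results=[1.0, 1.0, 1.0]))  # exact    -> 0.0
        print(mark(item, "3.1416", previous_results=[0.0, 0.0, 1.0]))  # tolerant -> 1.0
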
  • the marking algorithms in the databases may originate from previous assessments, including both static and dynamic assessments .
  • the database of marking algorithms is stored on the computer means.
  • the database of marking algorithms is stored on a record carrier for use with the computer means, for example a floppy disc or CD-Rom.
  • the marking algorithms are periodically moderated and new algorithms may be created or existing algorithms may be duplicated and updated or simply updated.
  • the moderation may be based on user performance and may be entirely automatic, semi-automatic or entirely human. Human moderation may be via input directly or indirectly into input means connected to the database of marking algorithms .
  • the database of questions and tasks includes questions and tasks with varying content, context, structure and language. It should be understood that the word "task" is used throughout to mean any activity used to elicit data from the user.
  • the database of questions and tasks includes questions and tasks covering one or more subject areas and/or psychometric and/or motor metric domains.
  • the database of questions and tasks includes questions and tasks at a plurality of difficulty levels. The questions and tasks in the database may originate from previous assessments.
  • the questions and/or tasks in the database of questions and/or tasks include links to other questions and/or tasks in the database to facilitate question and task selection. Those links may be between different difficulty levels and subject areas.
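  • One possible, purely illustrative data structure for such a linked question database is sketched below; the identifiers, fields and link semantics are assumptions.

        # Illustrative only: questions carry links to related questions across
        # difficulty levels and subject areas, to aid selection.
        QUESTIONS = {
            "geo_01": {"subject": "geometry", "level": 2,
                       "text": "Base angles of an isosceles triangle with apex 40 degrees?",
                       "links": ["geo_02", "trig_01"]},
            "geo_02": {"subject": "geometry", "level": 3,
                       "text": "Prove the base angles of an isosceles triangle are equal.",
                       "links": ["geo_01"]},
            "trig_01": {"subject": "trigonometry", "level": 2,
                        "text": "Find sin(70 degrees) to 2 d.p.",
                        "links": ["geo_01"]},
        }

        def linked_questions(question_id, max_level=None):
            # Follow the stored links, optionally filtering by difficulty level.
            out = []
            for linked_id in QUESTIONS[question_id]["links"]:
                if max_level is None or QUESTIONS[linked_id]["level"] <= max_level:
                    out.append(linked_id)
            return out

        print(linked_questions("geo_01", max_level=2))   # ['trig_01']
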
  • the database of questions and/or tasks has been previously generated by the computer means.
  • the database of questions and/or tasks has been previously generated by a human assessment developer.
  • the computer means may generate additional questions and/or tasks to use in the assessment and/or to add to the database of questions and/or tasks, throughout the assessment.
  • the database of questions and/or tasks may be stored on the computer means .
  • the database of questions and/or tasks may be stored on a record carrier for use with the computer means, for example a floppy disc or CD-Rom.
  • step d of computer means receiving the user's response further comprises the computer means receiving auxiliary information.
  • auxiliary information may comprise the time taken for the user's response.
  • the user's response combined with the auxiliary information is more useful than the user's response alone.
  • Such auxiliary information may, optionally, be stored in the database of questions and/or tasks alongside the relevant question or task.
  • the user outcome reported by the computer means may comprise a summative report and/or a formative report and/or a progressive report.
  • the user outcome reported by the computer means may be reported to the user (either directly or indirectly) or to an assessment developer or assessment agency (either directly or indirectly) .
  • a. host computer means processing any appropriate existing user data to determine a test starting point and a test delivery method; b. host computer means selecting at least one question and/or task from a database of questions and/or tasks; c. user computer means putting the selected question(s) and/or task(s) to the user; d. user computer means receiving the user's response; e. host computer means processing the user's response and, optionally, the existing user data, to generate a user result; f. repeating steps b, c, d and e until the host computer means can generate a user outcome, the question(s) and/or task(s) selected at each selecting step being dependent on the previous user result or results; and g. user computer means reporting the user outcome.
  • user computer means programmed to be operable to carry out a method as described above.
  • computer means comprising host computer means as described above and user computer means as described above.
  • a computer program including code portions which, when executed by computer means, cause that computer means to carry out the steps of a method as described above.
  • a record carrier having recorded thereon information indicative of a computer program as described above.
  • a method of constructing a user assessment database comprising the steps of: a. computer means selecting assessment data from a first database of assessment data, the assessment data selected being appropriate to the particular user assessment being constructed; b. computer means modifying at least some of the selected assessment data; and c. computer means inputting the selected assessment data, modified and unmodified, into a second database of assessment data.
  • the assessment data in the first database of assessment data have been at least partially generated by the computer means.
  • the assessment data in the first database of assessment data have been at least partially generated by a human assessment developer.
  • the assessment data may comprise questions and/or tasks.
  • the assessment data may comprise user responses.
  • the assessment data may comprise marking schemes.
  • the assessment data may comprise assessment items.
  • the assessment data may comprise auxiliary data e.g. time data connected to a user response.
  • the method further comprises step d of computer means generating additional assessment data and inputting the generated assessment data into the second database of assessment data.
  • the first database of assessment data may be stored on the computer means.
  • the first database of assessment data may be stored on a record carrier for use with the computer means .
  • the second database of assessment data may be stored on the computer means.
  • the second database of assessment data may be stored on a record carrier for use with the computer means.
  • the second database may be separate from the first database or it may be an updated version of the first database.
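  • A minimal sketch of the database-construction method (steps a to c above, with the optional generation step) follows; the field names and the modification rule are invented for illustration.

        # Illustrative only: constructing a second assessment database from a first.
        def construct_assessment_database(first_db, wanted_subject):
            # Step a: select data appropriate to the assessment being constructed.
            selected = [item for item in first_db if item["subject"] == wanted_subject]

            second_db = []
            for item in selected:
                if item.get("needs_update"):
                    # Step b: modify at least some of the selected data.
                    second_db.append(dict(item, text=item["text"] + " (updated)",
                                          needs_update=False))
                else:
                    second_db.append(item)       # unmodified data passes straight through

            # Optional step d: generate additional data and add it to the second database.
            second_db.append({"subject": wanted_subject,
                              "text": "Auto-generated item", "needs_update": False})
            return second_db                      # step c: the second database

        first = [{"subject": "maths", "text": "Q1", "needs_update": True},
                 {"subject": "physics", "text": "Q2", "needs_update": False}]
        print(len(construct_assessment_database(first, "maths")))   # 2
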
  • a method of assessing a user comprising the steps of: a. computer means selecting at least one question and/or task from a database of questions and/or tasks; b. computer means putting the selected question(s) and/or task(s) to the user; c. computer means receiving the user's response; d. computer means processing the user's response to generate a user result; e. repeating steps a, b, c and d until the computer means can generate a user outcome, the question(s) and/or task(s) selected at each selecting step being dependent on the previous user result or results; and f. computer means reporting the user outcome.
  • a method of awarding a level to a user who has been assessed comprising : a. computer means receiving a plurality of user outputs from user assessments; b. computer means setting preliminary level boundaries based on the user outputs; c. computer means checking and confirming preliminary level boundaries; d. computer means checking at least some user outputs; and e. computer means awarding a level to the user based on the user's user output and the level boundaries.
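  • As a sketch only, the level-awarding method above might be realised as follows; the use of percentile cut scores for the preliminary boundaries is an assumption, not a requirement of the specification.

        # Illustrative only: set preliminary boundaries from the distribution of
        # user outputs, then award a level against those boundaries.
        def set_level_boundaries(outputs, cut_points=(0.25, 0.5, 0.75)):
            # Steps a and b: receive the outputs and set preliminary boundaries.
            ranked = sorted(outputs)
            return [ranked[int(p * (len(ranked) - 1))] for p in cut_points]

        def award_level(user_output, boundaries):
            # Step e: award a level from the user's output and the boundaries
            # (the checking and confirmation of steps c and d are omitted here).
            level = 1
            for boundary in boundaries:
                if user_output >= boundary:
                    level += 1
            return level

        outputs = [12, 34, 41, 47, 55, 58, 63, 71, 77, 90]
        boundaries = set_level_boundaries(outputs)
        print(boundaries, award_level(63, boundaries))   # [41, 55, 63] 4
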
  • a method of assessing a user comprising the steps of: a. processing existing user data to determine a test starting point; b. selecting at least one question and/or task from a database of questions and/or tasks; c. putting the selected question(s) and/or task(s) to the user; d. receiving the user's response; e. processing the user's response and, optionally, the existing user data to generate a user result; f. repeating steps b, c, d and e until an appropriate user outcome can be generated, the question(s) and/or task(s) selected at each selecting step being dependent on the previous user result or results; and g. reporting the user outcome.
  • the existing user data may comprise historical data.
  • the existing user data may comprise existing ability data.
  • the existing user data may comprise attainment data.
  • the existing user data comprises an outcome or outcomes from previous assessments.
  • the test starting point may be determined via at least one algorithm from a database of algorithms .
  • the question or task may be selected from the database of questions and/or tasks via at least one algorithm from a database of algorithms.
  • the aforementioned databases of algorithms may comprise at least some algorithms whose steps are pre-determined and/or at least some algorithms whose steps are dependent on the existing user data and/or the user result or results and/or at least some algorithms having at least some steps which are pre-determined and at least some steps which are dependent on the existing user data and/or the user result or results.
  • the user result is generated via at least one marking algorithm from a database of marking algorithms.
  • the user outcome is generated via at least one marking algorithm from a database of marking algorithms .
  • the aforementioned databases of marking algorithms may comprise at least some marking algorithms whose steps are pre-determined (i.e. static algorithms) and/or at least some marking algorithms whose steps are dependent on the existing user data and/or the user's response and/or the user result or results (i.e. wholly dynamic algorithms) and/or at least some marking algorithms having at least some steps which are pre-determined and at least some steps which are dependent on the existing user data and/or the user's response and/or the user result or results (i.e. partially dynamic algorithms).
  • the marking algorithms are periodically moderated.
  • the database of questions and tasks includes questions and tasks with varying content, context, structure and language.
  • the database of questions and tasks includes questions and tasks covering one or more subject areas and/or psychometric and/or motor metric domains .
  • the database of questions and tasks includes questions and tasks at a plurality of difficulty levels.
  • the questions and tasks in the database of questions and tasks preferably include links to other questions and/or tasks in the database to facilitate question and task selection.
  • step d of receiving the user's response further comprises receiving auxiliary information.
  • That auxiliary information may comprise the time taken for the user's response.
  • the user outcome may comprise a summative report and/or a formative report and/or a progressive report. It will be understood that any feature of the invention described in relation to one aspect of the invention may be applied, where appropriate, to another aspect of the invention .
  • Figure 1 is a flow diagram showing the whole assessment process
  • Figure 2 is a schematic diagram showing the rules structure
  • Figure 3 is a schematic diagram showing the evidences matrix
  • Figure 4 is a flow diagram showing a method of test construction
  • Figure 5 is a flow diagram showing development of marking algorithms
  • Figure 6 is a flow diagram schematically showing the Testing process
  • Figure 7 is a flow diagram schematically showing the Marking & Output process
  • Figure 8 is a diagram of a system for implementing the assessment
  • Figure 9 is a flow diagram showing moderation of the marking process
  • Figure 10 is a flow diagram showing the process of marking to grading.
  • Dynamically adaptive assessment 103 comprising Testing 107 and Marking & Output 109.
  • Diagnostic outcomes 105
  • Psychometric and motor metric profiling 101 involves using existing information relevant to the assessment process. The requirements for this step will depend on the particular nature of that assessment, e.g. the output requirements and the purpose of the assessment. As can be seen in Figure 1, there are five main data areas used for the first step of psychometric or motor metric profiling, some or all of which may be used.
  • the first data area is historical information 111, which comprises existing information and any personal data that is relevant to the assessment concerned.
  • the historical data may include any personal information relevant to the pupil's learning (e.g. recent illness) .
  • the user is an employee or interview candidate
  • historical data may include data on previous work experience and/or in-house assessments.
  • the historical data may include data from previous paper or other off-screen assessment instruments.
  • the historical data may also include brain activity data e.g. a CAT Scan output.
  • the second data area is data on existing ability 113. These data may include information on the user's numerical, spatial, verbal and non-verbal reasoning, the user's IQ, the user's preferred learning style(s) and/or the user's personality type. Those data may also include any available special needs data or psychometric data or motor metric data. Those data may be obtained from previous assessments or from indicative tests.
  • the third data area is data on attainments 115.
  • the user is a school pupil those data will include previous examination results and educational attainment.
  • those data will include school examination results and educational attainment as well as any higher education results and other relevant skills.
  • those data may include motor skills mastered.
  • the fourth data area is any other relevant auxiliary data 117 particular to the user.
  • the fifth data area is diagnostic assessment 105 resulting from any previous assessment, and outputs from the marking and output stage of any previous assessment i.e. feeding back the diagnostic outcomes from the third stage of a previous assessment into the first stage of the next assessment and/or feeding back the outputs from the second stage of a previous assessment into the first stage of the next assessment. This improves the psychometric profiling and hence results in more accurate future outputs and diagnostic outcomes.
  • the result or outcome or performance data from a given assessment may be fed back into diagnostic assessment data from previous assessments in order to update the data set.
  • This is known as dynamic statistical analysis.
  • a set of user data resulted in a Gaussian distribution of results centred on 60% for a given assessment
  • new user data when fed into that data set may have the result of changing the data set e.g. to a Gaussian distribution centred on 59%.
  • dynamic statistical analysis may be performed to update the existing data.
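  • A small sketch of what such dynamic statistical analysis could look like, assuming a simple online mean and variance update (Welford's method); the scores are invented.

        # Illustrative only: each new result is fed back into the stored data set,
        # shifting the distribution (here just its mean and variance).
        class RunningStats:
            def __init__(self):
                self.n, self.mean, self.m2 = 0, 0.0, 0.0

            def update(self, score):
                self.n += 1
                delta = score - self.mean
                self.mean += delta / self.n
                self.m2 += delta * (score - self.mean)

            @property
            def variance(self):
                return self.m2 / self.n if self.n else 0.0

        stats = RunningStats()
        for score in [60, 62, 58, 61, 59]:      # existing data set centred near 60%
            stats.update(score)
        stats.update(49)                         # new user data shifts the centre
        print(round(stats.mean, 1))              # about 58.2 rather than 60.0
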
  • Dynamically adaptive assessment This is the core area of the overall process. The assessment involves two steps: Testing 107 and Marking & Output 109. In order to describe the process, a number of terms need now to be described.
  • the curriculum 201 is any area of study or competencies.
  • the curriculum may be dynamic and it may take the form of a set of study units which can be swapped in and out of a user's individual learning scheme.
  • the curriculum defines the domain of the assessment.
  • the curriculum may include units of a number of different school subjects.
  • the curriculum could alternatively take the form of non-subject specific units which alter in response to user results or outcomes.
  • the curriculum may comprise any mix of psychometric or motor metric domains .
  • the rules structure As seen in Figure 2, the rules structure comprises rules 203, attainments 205, elaborations 207, macro evidences 209 and micro evidences 211, each of which will be described below.
  • the rules structure is a way to measure performance of a user via those data and a way to inform the process steps of a dynamic algorithm (see “Decision and marking algorithms” below) .
  • progression is defined as the incremental increase in the acquisition of skills and knowledge and/or the ability to reason and provide solutions to problems through maturing cognitive or motor processes. Traditionally, such progressions have been determined by practitioners prejudging the performance of users at differing levels and/or using comparative methods on samples of user results.
  • the dynamic assessment may use one or both of those methods to develop through the use of static, adaptive or dynamic algorithms. Constant monitoring of user standards will improve the validity of the assessment.
  • Rules 203 For a given subject or study area, a set of rules can be determined by one of the methods given above. The rules set out one or more of: a set of skills, an area of knowledge, a measure of understanding or a level of ability to provide solutions to problems.
  • the rules need clarification, such clarification being a determination of the ways in which a user can demonstrate a required ability and/or show attainment of a particular rule. (There, of course, will be many different ways in which a user can demonstrate a required ability or particular attainment.)
  • a dynamic rule can be modified as a result of collective user performance, a modification to the curriculum (especially if that curriculum is, itself, dynamic) , shifts in technology or changing standards over time.
  • the curriculum 201 and the rules 203 are closely linked and each one alters as a result of changes in the other so that there is continuous feed back between them.
  • the links themselves between the curriculum and the rules may change throughout an assessment or assessments.
  • Attainments 205 and Elaborations 207 An attainment 205 is a given level of achievement e.g. in the case where the user is a school pupil, it might be the ability to achieve a grade C in GCSE mathematics. Attainments are dynamic by virtue of being able to respond to individual solutions to a task or problem (i.e. there are an infinite number of ways in which a pupil can demonstrate the ability to achieve a grade C in GCSE mathematics).
  • Each rule 203 has one or more attainments 205 contributing to it and each attainment 205 has one or more elaborations 207 contributing to it.
  • An elaboration 207 is a detailed breakdown of how an attainment may be evidenced e.g. for the above case, an associated elaboration might be the ability to achieve a certain level in, say, geometry. It can be seen that there will be a number of such elaborations making up the attainment.
  • each elaboration may contribute to a number of different attainments.
  • the relationship between attainments and elaborations may include Boolean operators, e.g. for a given attainment, elaboration X AND elaboration Y may be required, or elaboration A NOT elaboration B may be required.
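  • A minimal sketch of evaluating such a Boolean relationship, with an invented rule encoding:

        # Illustrative only: does a set of demonstrated elaborations satisfy the
        # Boolean rule attached to an attainment?
        def attainment_met(rule, demonstrated):
            # rule is ("AND", "X", "Y") or ("NOT", "A", "B");
            # demonstrated is the set of elaborations shown so far.
            op, first, second = rule
            if op == "AND":
                return first in demonstrated and second in demonstrated
            if op == "NOT":
                return first in demonstrated and second not in demonstrated
            raise ValueError(op)

        print(attainment_met(("AND", "X", "Y"), {"X", "Y", "Z"}))   # True
        print(attainment_met(("NOT", "A", "B"), {"A", "B"}))        # False
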
  • Elaborations 207 are more susceptible to change than attainments 205 as they are less generic and are more prone to alter as new methods of elaborating attainments are developed. Thus, elaborations for a given attainment are subject to change. Elaborations for a given attainment may be dynamic. In addition, for attainments that change, there will be new elaborations. Thus, the attainments and elaborations are closely linked and each one changes as a result of changes in the other so that there is continuous feedback between them.
  • Macro evidences 209 For each elaboration 207, there are many macro evidences 209. These macro evidences make up each elaboration and, in addition, each macro evidence will contribute to a number of different elaborations i.e. there is a many to many relationship between macro evidences and elaborations. For the example of an elaboration given above, an associated macro evidence might be a demonstration of understanding of, say, isosceles triangles. Different groups of macro evidences may contribute in two distinct ways to elaborations. Firstly, groups of the same macro evidence may indicate competence over chance i.e. a user is tested many times for a given elaboration, hence providing a group of macro evidences pointing to that elaboration and showing consistency. Secondly, groups of different macro evidences may indicate breadth of understanding.
  • Micro evidences 211 are the lowest meaningful actions that a user makes in the assessment. In practice they may be where a piece of data or a discrete digital object, a file, a graphic or similar has something meaningful done to it by a user (e.g. the user has opened a file or amended a graphic) .
  • a micro evidence might be the selection of correct answer B in response to a multiple choice question testing the angles in isosceles triangles.
  • the micro evidences 211 may also include auxiliary data e.g. time data (specifying the speed of processing) which may make a valid contribution to macro evidences 209 and then to an elaboration 207 e.g. one detailing efficiency.
  • a high level user may input correct answer B very quickly whereas a lower level user may take longer to input correct answer B.
  • For each macro evidence there are many micro evidences and for each micro evidence there are many macro evidences i.e. there is a many to many relationship between macro evidences and micro evidences .
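  • The many to many relationship between micro and macro evidences might be held as in the sketch below; the identifiers are invented for illustration.

        # Illustrative only: one micro evidence supports several macro evidences,
        # and each macro evidence is supported by several micro evidences.
        MICRO_TO_MACRO = {
            "mcq_isosceles_B": ["understands_isosceles", "understands_similar_triangles"],
            "mcq_angle_sum":   ["understands_isosceles", "understands_triangle_angle_sum"],
        }

        def macro_evidences_for(micro_ids):
            # Collect every macro evidence supported by the micro evidences gathered so far.
            gathered = {}
            for micro in micro_ids:
                for macro in MICRO_TO_MACRO.get(micro, []):
                    gathered.setdefault(macro, []).append(micro)
            return gathered

        print(macro_evidences_for(["mcq_isosceles_B", "mcq_angle_sum"]))
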
  • Figure 3 shows the evidence matrix 301, each macro evidence being represented schematically at 303.
  • macro evidences may be gained at a range of levels (1, 2, ...) across a range of strands (AA, AB, ... ZZ).
  • the pathway through the evidence matrix may be via any number of routes and is not necessarily linear.
  • the pathway through levels may be progressive i.e. coverage at one level may imply coverage or partial coverage at one or all lower levels or it may not be progressive i.e. coverage at a particular level can be achieved irrespective of coverage at other levels and does not necessarily imply coverage at lower levels.
  • the pathway through levels may or may not be linear. Certain strands may be relevant to only certain levels.
  • the evidence matrix may relate to one or more domains or subject areas and there may be links to other evidence matrices relating to other domains or subject areas.
  • each macro evidence 303 comprises a number of micro evidences 305 (shown schematically at 307) .
  • Each micro evidence 305 in a given macro evidence 303 will also form part of another macro evidence .
  • a number of macro evidences make up a given elaboration. The elaborations are dynamic in that the macro evidences comprising a given elaboration will vary for different users .
  • the micro evidences 305 are, say, answers to multiple choice questions relating to isosceles triangles .
  • the overall macro evidence 303 is a demonstration of understanding of isosceles triangles. That macro evidence, along with many other (cross-hatched) macro evidences at a number of different levels (the actual number and type of macro evidences being dependent on the user) , contribute to the "geometry" elaboration.
  • an understanding of isosceles triangles may also be useful in, say, trigonometry so that macro evidence will also contribute to the "trigonometry" elaboration.
  • a given multiple choice question relating to isosceles triangles may also test an understanding of, say, similar triangles, in which case, that micro evidence will also be associated with the "similar triangles" macro evidence.
  • the rules structure (comprising rules, attainments, elaborations, evidences and atomic indicators) is a way to measure user performance and thereby to assess user progression. That progression is defined as the incremental increase in the acquisition of skills and knowledge and/or the ability to reason and provide solutions to problems through maturing cognitive or motor processes.
  • those maturing cognitive or motor processes of a given user can be demonstrated by the components of the rules structure. Because the rules structure ranges from high to low level indicators, those maturing cognitive or motor processes may be demonstrated at a number of levels.
  • the algorithms used in the assessment may be static or dynamic, in either a static or dynamic test.
  • a static algorithm has a pre-determined formula (set of steps) for calculating the user's performance and hence determining the test's next pathway.
  • Static algorithms can themselves form part of a dynamic set of algorithms.
  • a dynamic algorithm is one which is itself determined by the user's pathway through the assessment.
  • a dynamic algorithm uses a set of rules which is, itself, dynamic i.e. dependent on the user's pathway through the assessment.
  • Trigger points are the points in the assessment at which a different course of action is required.
  • the trigger points may be either static or dynamic.
  • Static trigger points are at a pre-determined position in the assessment.
  • the static trigger points occur in an assessment where it is expected that a user will have provided a given level of information, number of evidences or demonstration of skill.
  • the trigger point informs the algorithm to make a calculation to affect the next pathway decision.
  • Dynamic trigger points are the points at which the accumulated evidence determines that a particular course of action is required to elicit a further type of evidence. That course of action may be, for example, a change of difficulty level, a change of direction of task or the asking of a direct question in a task-based test.
  • Dynamic trigger points are determined by a profiling algorithm which makes judgements on performance based on the evidences gained.
  • the dynamic trigger points are usually embedded at the elaborations level in the structure shown in Figure 2, in which case they are used to determine pathways and report. Alternatively they may be embedded at the rules level, to determine shifts in curriculum coverage or at the evidences level to influence tasks or question based decisions.
  • the trigger points may be action, inaction or time driven.
  • sufficiencies outlined below are preferably dynamic in that the number of opportunities to demonstrate a particular skill, knowledge or competence is dependent on the particular user's pathway through the assessment and also in that the number of demonstrations required is also dependent on the user's pathway.
  • Sufficiency This is one of the elements in any marking algorithm used to establish whether a user has demonstrated that he has the required skill, knowledge or competence.
  • sufficiency is determined by the number of opportunities to provide evidence that a particular pathway through the assessment has provided. Since the pathway will be different for different users, the sufficiencies will also be different for different users.
  • sufficiency thresholds can be dynamic in order to reflect overall performance as compared with the theoretical performance predicted by the test constructor.
  • Sufficiency of rules coverage comprises a set of algorithms, either static or dynamic, which determine whether a user has attained a particular standard, level or grade (should one be required by the tester or test agent) or to determine the user's profile on a progression matrix. This sufficiency can be dynamically altered either during or post test, to suit the needs of the tester or test agent. It is likely (but not required) that the dynamic sufficiency would be used in conjunction with a dynamic curriculum.
  • Sufficiency of attainments coverage The sufficiency comprises a set of algorithms, either static or dynamic, that determine sufficient competence in specific measure of attainment, which in turn inform the sufficiencies algorithms employed by the rules coverage.
  • This sufficiency is preferably predetermined for the purposes of assessment creation, in which case it may be dynamically changed by actual performance. Alternatively, this sufficiency may be determined by the computer means implementing the assessment (see Figure 8), in the case of an assessment generated by the computer means itself .
  • Sufficiency of elaborations coverage comprises a set of algorithms, either static or dynamic, that define whether a user has demonstrated competence in a contributing factor to an attainment and which in turn inform the sufficiencies algorithms employed by the attainments coverage.
  • the sufficiency of elaborations determined at the point of test writing is subject to dynamic change. As with attainments, this is likely to happen after the test, although when used in a dynamic test with a dynamic curriculum it is more likely that the algorithms for determining sufficiency of elaborations will need to be dynamic. The overall process allows for both.
  • the sufficiency thresholds will be test specific.
  • Sufficiency of evidences coverage comprises a set of algorithms, either static or dynamic, which define whether a user has demonstrated sufficient competence of evidences which in turn inform the sufficiencies algorithms employed by elaborations coverage.
  • As the rules-based marking model moves downwards through the structure (see Figure 2), definitions become more precise and therefore there are many more evidences than elaborations and many more elaborations than attainments. This means that it is most likely that dynamic algorithms will be used to calculate the evidence to elaboration relationships and thresholds.
  • dynamic algorithms are algorithms in which one or more of the process steps in the algorithm is not pre- determined but is, in fact, dependent on the results obtained.
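  • A hedged sketch of a dynamic sufficiency check: the threshold for an elaboration is not fixed but scales with the number of opportunities that this particular user's pathway happened to provide. The 60% proportion and the minimum of two demonstrations are illustrative assumptions.

        # Illustrative only: dynamic sufficiency of elaborations coverage.
        def elaboration_sufficient(evidences, opportunities, proportion=0.6, minimum=2):
            # evidences: supporting macro evidences actually demonstrated;
            # opportunities: chances the user's pathway offered to demonstrate them.
            required = max(minimum, int(round(proportion * opportunities)))
            return evidences >= required

        print(elaboration_sufficient(evidences=3, opportunities=4))   # True  (needs 2)
        print(elaboration_sufficient(evidences=3, opportunities=8))   # False (needs 5)
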
  • Outputs The outputs at the end of the dynamically adaptive assessment process can take the form of an adaptive learning environment, adaptive instruction both within and outside the system and/or a report.
  • Reports can be summative, formative and/or progressive (see below) . Reports can be of any length and can focus upon individual competences, processes, skills or other factors or can comprise a number of different competences, processes or skills. If the assessment is being made over time, the report may comprise a number of partial reports, each giving a partial "credit" to the user. For example, the assessment may be to measure an increase in a particular skill over time, in which case each partial report will show a skill level at a given stage and the total report will show the increase in the skill level over time. An example of this type of assessment is a patient convalescing after an accident, the assessment being used to measure an improvement in, say, the motor skills of the patient.
  • the assessment will show the general trend of the improvement, in the total report, and will also show the particular results in the partial reports which may be useful to indicate what type of factors are influencing the patient's improvement. Or, the assessment may be to measure different skills at different times (i.e. a modular-style course) in which case the partial reports can be saved and eventually contribute to a total report at the end of the assessment.
  • An example of this type of assessment is a user undertaking a modular course of study e.g. a part time degree course, where the assessment may be taking place over several years via a number of different modules, which may be cross curricular or cross psychometric and/or motor metric domains.
  • Summative reports give a simple user profile at the end of a learning/testing stage.
  • a summative report will aim to draw up a "level" or a grade, or a point in a progression matrix or profile for the user.
  • Formative reports aim to guide the user and/or influence future learning models.
  • a formative report may indicate a "level" or a grade, or a point in a progression matrix or profile, but will also aim to aid future progress or personal development or remedial action.
  • Progressive reports may give the user information regarding the best pathway for maximised future learning, personal development, psychometric or motor metric improvement or remedial action based on the assessment.
  • the dynamically adaptive assessment (comprising the separate "Testing" and "Marking & Output" steps) is a dynamic assessment. As already mentioned, this is distinct from an adaptive assessment (which has a number of pre-determined, definable pathways through it) by virtue of having any number of possible pathways through the assessment (which correspond to the number of possible users of the assessment).
  • Each of those terms may itself be static or dynamic.
  • a static term is one which is entirely pre-determined.
  • a dynamic term is one which is partially or wholly dependent on the user or users or user action. For example, if the elaborations are static, there are a given number and group of elaborations to demonstrate a particular attainment, irrespective of the particular user and the elaborations demonstrated to date. Alternatively, if the elaborations are dynamic, the number and group of elaborations required for a given attainment is not predetermined but is dependent on the user i.e. the elaborations/evidences demonstrated so far, the psychometric profiling data etc.
  • the assessment may cover all or part of the user's curriculum at one or more different levels.
  • the actual pathway through the dynamic assessment is determined by the evidences shown as the user answers the questions or completes the tasks and the pathway is also affected by the purpose of the assessment, i.e. the skills, knowledge, understanding and/or applications to be assessed which in turn affect the algorithms embedded into the assessment.
  • the elements of the Testing step which contribute to the dynamic character of the assessment are the breadth of the curriculum or curricula covered, the structure of the test and the questions or tasks themselves.
  • the breadth of the curriculum or curricula covered Depending on the dynamic nature of the curriculum, as described previously, the test can, according to the outcomes and inferences made, be self determining as to whether or not it covers a broad spread or a narrow focus of the curriculum or whether it includes cross curricula performance for example maths as part of a physics assessment, English as part of a history assessment or more skills as part of an ICT (Information and Communications Technology) assessment.
  • An example of this may be that a user has evidenced a particular level of performance which precludes them from being awarded a higher level, but the test may wish to pursue areas of strength and/or weakness in the lower level for formative and/or progression reporting.
  • a question-based test may dynamically alter the level of difficulty of the question within a level to refine its ranking decision and reporting.
  • the decision may be to ask a direct question to elicit evidence .
  • the questions or tasks themselves have dynamic elements: content, e.g. data, information, graphics, digital assets; context, e.g. the setting within which a question or task resides; the structure of the task or question, e.g. how and why it has been written; and the language used in the question or task.
  • the content of a question or task may be dynamically changed. For example, a list of alternative proper nouns or names may be substituted where a place-holder is embedded in the question or task.
  • data may be called and used via dynamic links or randomly generated. This form of dynamic change will need to be managed through a set of parameters or a fixed alternative data set or data sets to ensure that what is being tested is what was intended to be tested.
  • the marking algorithms may be dynamically changed to accommodate any dynamic change in content.
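  • A small sketch of dynamically changing content through an embedded place-holder and a fixed alternative data set, with the marking rule parameterised to match; the template and values are invented.

        # Illustrative only: place-holder substitution with a managed data set.
        import random

        TEMPLATE = "{name} cycles {distance} km in 2 hours. What is the average speed in km/h?"
        ALTERNATIVES = [{"name": "Asha", "distance": 24}, {"name": "Tom", "distance": 30}]

        def instantiate_question(seed=None):
            values = random.Random(seed).choice(ALTERNATIVES)
            question = TEMPLATE.format(**values)
            expected = values["distance"] / 2     # the marking changes with the content
            return question, expected

        question, expected = instantiate_question(seed=1)
        print(question)
        print(expected)
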
  • Context The context of a question or task may be dynamically or statically changed to make the test relevant to the user. Dynamic change may be by altering the context whilst the test is in progress, whereas static change may be by a choice of fixed contexts at the start of the test.
  • Dynamically changing the structure of a question or task may be performed when a user shows signs of misunderstanding or misinterpreting the intention of a question or task. For example, it may be that the test marking algorithms detect inappropriate responses and thereby make inferences that the user may require a different structure to the question or task to suit that particular learning style or cognitive profile. For example, the particular learning style or cognitive profile may determine the mode of delivery e.g. audio instead of text questions. The user's psychometric profiling data may also aid the algorithms in this respect.
  • the dynamic test may, as a result, present the question or task graphically rather than textually, or aurally through the use of a text to speech synthesiser.
  • Alternative methods of dynamically altering the test structure may be to simplify the grammatical elements of the question, task, resource or information.
  • a dynamically changing structure may be used if the psychometric profiling data showed signs of special needs, e.g. dyslexia.
  • the marking algorithms may be dynamically changed to accommodate any dynamic change in structure .
  • the element of the Marking & Output step which contributes to the dynamic character of the assessment is the use of complex marking arrays .
  • the marking arrays are unique to each user e.g. the route through the marking array is multi dimensional and depends on the particular user rather than being pre-determined.
  • the marking arrays are reduced to a "one-dimensional" marking scheme where the route through the marking scheme is a pre-determined one-dimensional line.
  • the overall process is implemented on a computer based system.
  • the system comprises host computer means (which has access to the tasks, questions, algorithms etc) and one or more user computer means which output questions/tasks (in a number of different ways), receive question/task responses (in a number of different ways) and may also output outputs and/or diagnostic outcomes (in a number of different ways) .
  • Figure 4 shows one method of test construction.
  • the computer- based system holds existing data, tasks and/or questions in a searchable database together with any auxiliary information (e.g. user response speed).
  • Each piece of data in the database is linked to many other pieces of data (a many to many relationship) in order to provide links for the test developer and/or host computer means.
  • the assessment developer requests data from the database, step 401.
  • the assessment developer may be human or computer-based.
  • the data search engine searches the database for pieces of data.
  • Those data may be questions or tasks 405, assessment items 407, marking schemes 409, user responses 411 or auxiliary data 412.
  • Assessment items 407 may include a piece of data, a discrete digital object, a file, a graphic, a web page, a hyperlink, a partially completed equation etc.
  • Auxiliary data 412 may include e.g. time data relating to user responses.
  • the new assessments are developed at step 421 via completely new test data 415, modified existing test data 417 and unmodified existing test data 419. By using such a database, efficiency of test development is improved as well as the consistency and accuracy of the tests themselves .
  • SEN: Special Educational Needs
  • SEN is a well-known educational term and comprises use of e.g. enlarged text, speech to text synthesisers, extended keyboards etc.
  • Figure 5 shows the development of trigger points and marking algorithms.
  • the computer-based system holds marking algorithms and trigger points already known in static and adaptive testing.
  • the assessment developer may be either human or computer-based.
  • the assessment developer works on the initial ideas, step 501.
  • Those ideas are inputted to modelling means 503, which models the initial ideas, receiving inputs from the database of marking algorithms 505.
  • Potential models are outputted at 507 and fed back to the assessment developer or computer means to modify the assessment developer's ideas and ensure sufficiency.
  • the models are also inputted into the database 505 for future use.
  • the feedback loop of Figure 5 uses existing models combined with concepts of assessment developers to improve assessment development.
  • the Testing process is shown schematically in Figure 6.
  • the data from the psychometric profiling step 601 are inputted into a process 603 which determines an appropriate test entry point using static, adaptive and/or dynamic logic and/or algorithms.
  • the psychometric profiling data itself comprises diagnostic outcomes and outputs fed back from previous assessments.
  • a task or question is selected at step 605 from a database 607 of questions and tasks.
  • the process then repeats, i.e. further questions and/or tasks are selected and further user responses are received.
  • the user's response or responses received at step 609 together with the psychometric profiling data 601 will determine an appropriate next task or question.
  • the test data 611 is ready to input to the Marking & Output process.
  • the Marking & Output process is shown schematically in Figure 7.
  • the data 701 from the testing and the data 703 from the psychometric profiling step are inputted into a marking process 705 which uses static, adaptive and/or dynamic logic and/or algorithms to generate outputs 707. Those outputs 707 may, in turn, be fed back into the psychometric profiling data 703 if appropriate.
  • the dynamically adaptive assessment can be implemented on the system illustrated in Figure 8.
  • the system 801 comprises host computer means 803 and one or more user computer means 805, 805', 805''....
  • the host computer means 803 has access to a database of questions and tasks 807, a database of algorithms for test entry point selection 809 and a database of marking algorithms 811.
  • the host computer means 803 has inputs from the user computer means 805 and from any other data e.g. the psychometric profiling data.
  • the host computer means 803 has output(s) to the user computer means 805.
  • the user computer means has output to a user (for putting questions and tasks to a user) and input from a user (for user responses) .
  • the host computer means 803 uses an inferential engine (not shown) for its process steps.
  • the inferential engine allows the host computer means 803 to make inferences from the available data to generate the next process step.
  • the inferential engine is used by the host computer means 803 to deduce a suitable test entry point based on the psychometric profiling data (step 603) and also to select a question or task based on user response, which in turn depends on the previous question or task and the psychometric profiling data (step 605) .
  • the inferential engine is used by the host computer means 803 to generate an output 707 via a marking process 705 which is based on captured test data 701 and psychometric profiling data 703. Marking can be in the form of logic decisions based on outputs from the inferential engine.
  • the inferential engine also allows the computer means to make inferences from the available data in other areas e.g. assessment development, generation of marking algorithms.
  • An inference engine may for example be implemented: context to context, within a subject domain; subject domain to subject domain (e.g. Physics to Maths or History to English); non-curriculum to curriculum (e.g. verbal reasoning to English); award to award or unit to unit (e.g. NVQ metalworking to GNVQ Plumbing) .
  • Previous data may determine the start point in terms of, for example, area of curriculum or the method of delivery (e.g. an audio version for blind people) .
  • a person about to be assessed in a particular subject domain (say Mathematics) may be profiled from his or her previous results in other subject domains (e.g. Physics) in which underlying mathematical concepts are implicitly assessed.
  • the resultant profile could determine the level of difficulty start point for the Maths assessment, based on the concepts already assessed in Physics, or indeed the curriculum coverage specifically required by the Maths test to award a grade or outcome.
  • a first user may be delivered a test that starts with quadratic equations, determined by a conceptual profile from previous Maths tests and Physics/Chemistry.
  • a second user may be delivered a test starting with a higher mathematical concept, in the same area of the curriculum - say complex quadratics, based on the second user's performance profile.
  • an inference engine may look at a non-related subject domain and, through the use of statistical inference, determine that people who achieve a specific level or grade in one subject domain usually perform at a certain level in another. (This can also apply to statistical inferences from non-subject domains e.g. Verbal Reasoning or Spatial Awareness assessments.) A sketch of this kind of cross-domain inference is given after this list.
  • an inference engine may use dynamic statistical information to predict relationships between concepts within a curriculum (static or dynamic) and direct tasks or questions to test that prediction; if sufficient inferences (or hypotheses) are established, the remainder can be assumed also to be upheld.
  • the marking process should be moderated to facilitate test development and standardisation, and this process is shown schematically in Figure 9.
  • the marking and moderation are "human" but are supported by a computer-based system.
  • the marking and moderation may be carried out by the host computer means e.g. using
  • the test data 901 captured by the system are transferred to human marking 903 and human moderation 905.
  • the human marking and moderation may be carried out in a number of ways: on screen marking and moderation i.e. directly into the computer based system; off screen marking followed by data entry (either manual or automatic) into the computer-based system; or onscreen marking combined with other data to assist in the marking processes i.e. somewhere between marking directly onto the computer-based system and marking off computer-based system with a separate data entry step. It is possible for the marking and/or moderation steps to be entirely automatic (i.e. without human input) using a computer-based system comprising Artificial Intelligence or Neural Network engines - this is not shown specifically in Figure 9.
  • the results of the marking 903 and/or moderation 905 steps are fed into a comparator 907 which checks quality and consistency.
  • the outputs from the comparator 907 are fed back into the marking 903 and/or moderation 905 steps to provide a feedback loop improving consistency.
  • the outputs from the comparator 907 are also fed (either automatically or manually) into an awarding system 909 which may be arranged to provide outputs in a number of ways e.g. comparison charts, improved marking schemes, particular examiner data.
  • Figure 10 shows a system which may be used to connect the marking process outlined in Figure 7 to an award of levels or grades. All tests (whether static, adaptive or dynamic) rely on a marking scheme and a calculation to convert user responses into some sort of level, award or grade.
  • the process used in this embodiment to provide an assessment of levels is shown in Figure 10 and this process improves the validity, reliability and efficiency of the marking process, giving more precision to the awarding process and more control to the examining experts (or computer-based system comprising Artificial Intelligence or Neural Network engines) .
  • the process consists of six stages, the outputs from each stage being fed back into the previous stage to ensure the highest possible accuracy for future tests.
  • the outputs of the users are confirmed (e.g. confirming a particular percentage previously established for each user) .
  • material is obtained for setting preliminary levels for the user outputs (e.g. categorising the user results into groups e.g. grade A, B).
  • preliminary level boundaries are set using static, adaptive and/or dynamic algorithms (e.g. grade A/B boundary set at 75%) .
  • the level boundaries established at the third stage are confirmed by one or more examining experts (or computer-based system comprising Artificial Intelligence or Neural Network engines) .
  • the actual user outputs are reviewed.
  • results at each stage 1001 to 1011 are fed into a data set 1013 and then into a statistical output 1015 in the form of onscreen data. That onscreen data can be used throughout the process to keep track of the results and assist the decision making at each stage.
  • the third step in the overall process of diagnostic outcomes is a developed form of the output stage of the dynamically adaptive assessment.
  • the outputs at the end of the dynamically adaptive assessment, as described previously, may indicate potential areas of special needs, conceptual weakness, cognitive or motor weakness that need to be investigated. This is particularly useful when compared to the results from the first step of psychometric or motor metric profiling.
  • the diagnostic outcomes are similar to those outputs but may also include specific reporting requirements for referral e.g. to specialist practitioners.
  • the diagnostic outcomes may include a grade or level or a point in a progression matrix, a report, a user profile, a proposed user work plan.
  • the diagnostic outcomes may include material or links to materials. Such materials could be used by the user to facilitate future learning and/or to meet particular needs .
  • the diagnostic outcomes may propose further assessment.
  • the diagnostic outcomes can be fed back into the psychometric profiling stage in order to improve the results in future assessments .
  • the diagnostic outcomes may be directed to a user, an assessment developer, an assessment agency or another interested party.
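By way of illustration only, the following sketch shows one way the cross-domain statistical inference referred to in the list above might be realised. The cohort data, the domain names, the use of a simple least-squares regression and the banding thresholds are all hypothetical choices made for this sketch; they are not features of the system itself.

```python
from statistics import mean

# Hypothetical cohort data: (Physics score %, Maths score %) pairs from previous assessments.
COHORT = [(82, 78), (64, 60), (91, 88), (55, 47), (73, 70), (68, 66), (88, 85), (49, 44)]

def fit_line(pairs):
    """Ordinary least-squares fit y = a*x + b over (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    return a, my - a * mx

def predict_entry_level(physics_score, pairs=COHORT):
    """Infer a Maths entry point from a related-domain (Physics) score."""
    a, b = fit_line(pairs)
    predicted_maths = a * physics_score + b
    # Hypothetical banding: map the predicted score onto a difficulty start point.
    if predicted_maths >= 80:
        return "complex quadratics"
    if predicted_maths >= 60:
        return "quadratic equations"
    return "linear equations"

if __name__ == "__main__":
    print(predict_entry_level(85))   # a strong Physics result -> harder start point
    print(predict_entry_level(52))   # a weaker result -> easier start point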

Description

DYNAMIC ASSESSMENT SYSTEM
This invention relates to a method of assessment.
Assessing participants in a number of different environments is, of course, well known. On a simple level, a school pupil is tested throughout his education to assess his knowledge and understanding. At a more complex level, a job applicant may be tested to assess the sort of job he is suited to. Within the work place, team building exercises may comprise testing of employees to assess their strengths and weaknesses and how they relate to others in that workplace.
Some of the more straightforward assessments will simply require a "level" at output e.g. grade B in a given subject area. However, some more complex assessments will require more than just a "level" at the output, instead requiring a set of conclusions and perhaps recommendations as to further action.
Most of the assessments currently available are termed static assessments. Static assessments comprise a number of questions or tasks through which the user works, an output being given at the end of the question or task set. Clearly, in this type of assessment, there is no room for modification according to the user e.g. a high level user works through the same question or task set as a low level user. There is only one pathway through the assessment (from the first question or task to the last question or task) .
An improvement on static assessments is known and such assessments are often termed adaptive assessments. Adaptive assessments comprise a number of predetermined pathways through the assessments. A particular pathway may be selected at a decision or trigger point. For example, a yes/no question may be asked; if the answer is yes, the test proceeds to next question A; if the answer is no, the test proceeds to next question B. Thus, when the user answers the question, there is a trigger point selecting one of two pathways. There may be several trigger points in the test, each selecting one pathway from a selection of predetermined pathways .
In an adaptive assessment, although there is some modification of the assessment pathway in response to different users, the modification is limited as each pathway is one of a predetermined set. Thus, there are inevitably more types of user taking the assessment than there are predetermined pathways through it. So, for at least some users, the pathway is time inefficient as the questions or tasks are not the most effective way to gather data on a particular user.
In addition, an adaptive assessment takes no account of user data to select a pathway, so the results are naturally less accurate. For example, if the user were an expert mathematician, it would only take a small number of high level, detailed mathematics questions to provide fast and highly accurate results. An adaptive assessment would not take account of this and the user would be required to take all levels of the assessment via one of a number of predetermined pathways, resulting in slow and not very accurate results.
It is an object of the invention to provide a method of assessment which eliminates or mitigates the problems associated with known assessment methods, described above.
According to the invention, there is provided a method of assessing a user comprising the steps of:
a. computer means processing existing user data to determine a test starting point;
b. computer means selecting at least one question and/or task from a database of questions and/or tasks;
c. computer means putting the selected question(s) and/or task(s) to the user;
d. computer means receiving a response from the user;
e. computer means processing the user's response to generate a user result;
f. repeating steps b, c, d and e until the computer means can generate a user outcome, the question(s) and/or task(s) selected at each selecting step being dependent on the previous user result or results; and
g. computer means reporting the user outcome.
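A minimal sketch of the loop defined by steps a to g is given below. It is illustrative only: each callable passed into the function (select_entry_point, select_next, mark_response and so on) is a hypothetical stand-in for whichever algorithm the computer means actually applies, not a definition of that algorithm.

```python
def assess_user(user_data, question_bank, select_entry_point, select_next,
                mark_response, outcome_ready, put_to_user, report):
    """Illustrative loop over steps a-g; every callable passed in is a stand-in."""
    # Step a: process existing user data to determine a test starting point.
    current = select_entry_point(user_data, question_bank)
    results = []
    while True:
        # Step c: put the selected question/task to the user; step d: receive a response.
        response = put_to_user(current)
        # Step e: process the response to generate a user result.
        results.append(mark_response(current, response, user_data))
        # Step f: stop once a user outcome can be generated.
        if outcome_ready(results):
            break
        # Step b (repeated): the next selection depends on the previous result(s).
        current = select_next(results, user_data, question_bank)
    # Step g: report the user outcome.
    return report(results)
```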
The question or task of step b will of course be chosen in accordance with the test starting point determined in step a. The test starting point may for example be a question or task or a group of questions or tasks from which a question or task is selected.
In contrast to the prior art static and adaptive approaches, in this dynamic approach elements of the assessment are adapted, where necessary, to reflect current user data and circumstances. That is not possible in a paper-based assessment model.
A dynamic test is distinct from an adaptive test because an adaptive test consists of a series of predetermined pathways with decision trigger points. An adaptive test may appear to be dynamic but is in fact a series of distinct and discrete fixed-path tests; an adaptive test is, in that sense, static rather than dynamic. A dynamic test, on the other hand, may use information from its embedded marking rules, sufficiency thresholds, trigger point decisions and the curriculum (any of which in themselves may be dynamic or static) . Compound multi-dimensional complex "marking" arrays may be used, integrating one to many, many to one and many to many relationships (where the term "marking" is used in context of logic structure decisions and/or sufficiency thresholds and/or marking rules across levels of progression through to the curriculum and/or cross curricula) .
Thus the starting point of the assessment is determined by processing existing user data (where such data exists) and the route that the user takes through the test depends on user responses (which may be the responses of the user or may comprise the responses of a plurality of users) . The question (s) and/or task(s) selected at each selecting step is dependent on the previous user result or results: it may be dependent on the previous response or responses of the user; it may be dependent on a cluster of previous responses or a sufficiency rule (for example, it may decide that sufficient evidence has been gathered in respect of a first subject, so that the next selection moves on to a second subject) .
User responses may be used to modify pre-conceived rules or may be used to determine rules . The system may learn from responses given (which is especially relevant in a process- driven assessment, where alternative methods may be deployed to arrive at a desired outcome) .
Repeating steps b, c, d and e, whilst selecting appropriate questions or tasks based on previous user result (s) means the pathway selected through the assessment is tailor-made to a particular user. Thus, the assessment is time efficient and is able to provide a useful outcome much more quickly than, say, an adaptive assessment, which simply selects the pathway- through the assessment from a set of predetermined pathways. The assessment is also more accurate as the data gathered is appropriate to the particular user.
By processing existing user data to determine a test starting point, the assessment can provide highly accurate results in a short time. The existing user data which is processed is that which is appropriate to the particular assessment. Thus, in many cases not all existing user data will be processed; only that which is relevant to the assessment will be selected. Thus, the assessment has improved accuracy (as the existing user data can be used to go straight to relevant questions or tasks for a given user) and is time efficient (as the existing user data may prevent useless questions or tasks being asked or set at the outset of the assessment and/or throughout the assessment). Thus the method may further comprise computer means processing the existing user data as well as the user's response to generate the user result. The existing user data may be from one user (the present user) or from a plurality of users.
In a preferred embodiment, the computer means comprises host computer means and user computer means. The host computer means may perform one or all of steps a, b, c, d, e and g. The user computer means may perform one or all of steps a, b, c, d, e and g.
The user outcome generated by the computer means may take a variety of forms. In one example, the computer means may be able to generate a highly detailed user outcome. In another example, the computer means may be unable to generate a detailed user outcome because the user responses are inadequate. (In that case, it may be appropriate for the user outcome to simply be an indication that the results are inadequate.) Alternatively, the computer means may be programmed to generate a user outcome after a given time of testing, in which case the detail and usefulness of the user outcome will depend on the user. The computer means uses the user results to determine an appropriate time and form of user outcome.
The user outcome may be a recommendation for the user to undertake a static or adaptive test. More preferably, the user outcome itself may be in the form of a static or adaptive test.
The computer means may put the selected question or task to the user either directly or indirectly. If the computer means puts the selected question or task indirectly to the user, this may be via another user or via a further computer means. Similarly, the computer means may receive the user' s response either directly or indirectly. If the computer means receives the user' s response indirectly, this may be via another user or via a further computer means.
The existing user data may comprise historical data. The existing user data may comprise existing ability data. The existing user data may comprise attainment data.
Preferably, the existing user data comprises a user outcome or outcomes from previous assessments. By using previous user outcomes, the assessment improves even more in accuracy by improving the selection of test starting point as well as the selection of questions and/or tasks. In one embodiment of the invention, step a further comprises computer means processing existing user data to determine a test delivery method.
In an embodiment of the invention, the computer means determines the test starting point via at least one algorithm from a database of algorithms. In an embodiment of the invention, the computer means selects the question(s) and/or task(s) from the database of questions and/or tasks via at least one algorithm from a database of algorithms. As used herein, the word "algorithm" means any suitable algorithm or rule.
The computer means may use either evidence or inference to determine the test starting point. The computer means may use either evidence or inference to select a question (s) and/or task(s) from the database of questions and/or tasks. In an embodiment of the invention, the computer means comprises an inferential engine, which may determine the test starting point and/or select questions or tasks.
The aforementioned databases of algorithms may comprise at least some algorithms whose steps are pre-determined and/or at least some algorithms whose steps are dependent on the existing user data and/or the user result or results and/or at least some steps which are pre-determined and at least some steps which are dependent on the existing user data and/or the user result or results .
Those algorithms whose steps are entirely predetermined are static algorithms. Those algorithms whose steps are dependent on the existing user data and/or the user result or results are either partially or wholly dynamic. However, because the selection of algorithms (whether those algorithms are static or dynamic) is preferably not predetermined, but will instead depend on the user, even a set of static algorithms can form a dynamic set .
The algorithms in the databases may originate from previous assessments, including both static and dynamic assessments.
The aforementioned databases of algorithms may be stored on the computer means . Alternatively, the databases of algorithms may be stored on a record carrier for use with the computer means, for example a floppy disc or CD-Rom.
In an embodiment of the invention, the user result is generated via at least one marking algorithm from a database of marking algorithms. In an embodiment of the invention, the user outcome is generated via at least one marking algorithm from a database of marking algorithms .
It will be understood that use of the word "marking" does not necessarily imply a traditional marking process in which the result or outcome is a simple mark or grade. In fact, the result or outcome may be a suggestion for future assessment pathways, a recommendation for user assistance or similar.
The aforementioned databases of marking algorithms may comprise at least some marking algorithms whose steps are pre-determined and/or at least some marking algorithms whose steps are dependent on the existing user data and/or the user's response and/or the user result or results and/or at least some marking algorithms having at least some steps which are pre-determined and at least some steps which are dependent on the existing user data and/or the user's response and/or the user result or results . Those marking algorithms whose steps are entirely predetermined are static marking algorithms. Those marking algorithms whose steps are dependent on the existing user data and/or the user result or results are either partially or wholly dynamic. However, because the selection of marking algorithms (whether those marking algorithms are static or dynamic) is preferably not predetermined, but will instead depend on the user, even a set of static marking algorithms can form a dynamic set.
The marking algorithms in the databases may originate from previous assessments, including both static and dynamic assessments .
Optionally, the database of marking algorithms is stored on the computer means. Alternatively, the database of marking algorithms is stored on a record carrier for use with the computer means, for example a floppy disc or CD-Rom.
Preferably, the marking algorithms are periodically moderated and new algorithms may be created or existing algorithms may be duplicated and updated or simply updated. The moderation may be based on user performance and may be entirely automatic, semi-automatic or entirely human. Human moderation may be via input directly or indirectly into input means connected to the database of marking algorithms .
Preferably, the database of questions and tasks includes questions and tasks with varying content, context, structure and language. It should be understood that the word "task" is used throughout to mean any activity used to elicit data from the user. Advantageously, the database of questions and tasks includes questions and tasks covering one or more subject areas and/or psychometric and/or motor metric domains. Preferably, the database of questions and tasks includes questions and tasks at a plurality of difficulty levels. The questions and tasks in the database may originate from previous assessments.
Preferably, the questions and/or tasks in the database of questions and/or tasks include links to other questions and/or tasks in the database to facilitate question and task selection. Those links may be between different difficulty levels and subject areas.
In an embodiment of the invention, the database of questions and/or tasks has been previously generated by the computer means. In an alternative embodiment of the invention, the database of questions and/or tasks has been previously generated by a human assessment developer. In either case, the computer means may generate additional questions and/or tasks to use in the assessment and/or to add to the database of questions and/or tasks, throughout the assessment.
The database of questions and/or tasks may be stored on the computer means . Alternatively, the database of questions and/or tasks may be stored on a record carrier for use with the computer means, for example a floppy disc or CD-Rom.
In an embodiment of the invention, step d of computer means receiving the user' s response further comprises the computer means receiving auxiliary information. In one example, that auxiliary information may comprise the time taken for the user's response. In many cases, the user's response combined with the auxiliary information is more useful than the user' s response alone. Such auxiliary information may, optionally, be stored in the database of questions and/or tasks alongside the relevant question or task. The user outcome reported by the computer means may comprise a summative report and/or a formative report and/or a progressive report. The user outcome reported by the computer means may be reported to the user (either directly or indirectly) or to an assessment developer or assessment agency (either directly or indirectly) .
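A small sketch of how a response and its auxiliary timing information might be captured together is given below. The class and function names are hypothetical, and timing the response is simply one possible form of auxiliary information.

```python
import time
from dataclasses import dataclass

@dataclass
class CapturedResponse:
    """A user response plus auxiliary information (here, the time taken)."""
    question_id: str
    answer: str
    seconds_taken: float

def capture_response(question_id: str, ask) -> CapturedResponse:
    """Times the call to `ask` (a stand-in for putting the question to the user)."""
    start = time.monotonic()
    answer = ask(question_id)
    return CapturedResponse(question_id, answer, time.monotonic() - start)

if __name__ == "__main__":
    # Simulated user input for the sake of a runnable example.
    demo = capture_response("isosceles-q1", lambda qid: "B")
    print(demo)
```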
According to the invention, there is also provided computer means programmed to be operable to carry out a method as described above.
According to the invention, there is also provided a method of assessing a user comprising the steps of:
a. host computer means processing any appropriate existing user data to determine a test starting point and a test delivery method;
b. host computer means selecting at least one question and/or task from a database of questions and/or tasks;
c. user computer means putting the selected question(s) and/or task(s) to the user;
d. user computer means receiving the user's response;
e. host computer means processing the user's response and, optionally, the existing user data, to generate a user result;
f. repeating steps b, c, d and e until the host computer means can generate a user outcome, the question(s) and/or task(s) selected at each selecting step being dependent on the previous user result or results; and
g. user computer means reporting the user outcome.
According to the invention, there is also provided user computer means programmed to be operable to carry out a method as described above.
Further, according to the invention, there is provided computer means comprising host computer means as described above and user computer means as described above.
According to the invention, there is further provided a computer program including code portions which, when executed by computer means, cause that computer means to carry out the steps of a method as described above.
According to the invention, there is further provided a record carrier having recorded thereon information indicative of a computer program as described above.
According to the invention, there is also provided a signal carrying information indicative of a computer program as described above.
According to the invention, there is also provided a method of constructing a user assessment database comprising the steps of:
a. computer means selecting assessment data from a first database of assessment data, the assessment data selected being appropriate to the particular user assessment being constructed;
b. computer means modifying at least some of the selected assessment data; and
c. computer means inputting the selected assessment data, modified and unmodified, into a second database of assessment data.
In an embodiment of the invention, the assessment data in the first database of assessment data have been at least partially generated by the computer means. Optionally, the assessment data in the first database of assessment data have been at least partially generated by a human assessment developer.
The assessment data may comprise questions and/or tasks. The assessment data may comprise user responses. The assessment data may comprise marking schemes. The assessment data may comprise assessment items. The assessment data may comprise auxiliary data e.g. time data connected to a user response.
Preferably, the method further comprises step d of computer means generating additional assessment data and inputting the generated assessment data into the second database of assessment data.
The first database of assessment data may be stored on the computer means. Alternatively, the first database of assessment data may be stored on a record carrier for use with the computer means .
The second database of assessment data may be stored on the computer means. Alternatively, the second database of assessment data may be stored on a record carrier for use with the computer means. The second database may be separate from the first database or it may be an updated version of the first database.
According to the invention, there is also provided a method of assessing a user comprising the steps of:
a. computer means selecting at least one question and/or task from a database of questions and/or tasks;
b. computer means putting the selected question(s) and/or task(s) to the user;
c. computer means receiving the user's response;
d. computer means processing the user's response to generate a user result;
e. repeating steps a, b, c and d until the computer means can generate a user outcome, the question(s) and/or task(s) selected at each selecting step being dependent on the previous user result or results; and
f. computer means reporting the user outcome.
According to the invention, there is also provided a method of awarding a level to a user who has been assessed comprising:
a. computer means receiving a plurality of user outputs from user assessments;
b. computer means setting preliminary level boundaries based on the user outputs;
c. computer means checking and confirming preliminary level boundaries;
d. computer means checking at least some user outputs; and
e. computer means awarding a level to the user based on the user's user output and the level boundaries.
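The following sketch illustrates steps b and e of the awarding method above under assumed policies: the proportions used to set the preliminary boundaries and the fallback grade are hypothetical, and the confirmation and checking steps (c and d) are omitted.

```python
def preliminary_boundaries(user_outputs, proportions=(("A", 0.20), ("B", 0.30))):
    """Set preliminary grade boundaries from the distribution of user outputs.
    `proportions` is a hypothetical policy: top 20% of outputs -> A, next 30% -> B."""
    ranked = sorted(user_outputs, reverse=True)
    boundaries = {}
    cumulative = 0.0
    for grade, share in proportions:
        cumulative += share
        index = max(int(cumulative * len(ranked)) - 1, 0)
        boundaries[grade] = ranked[index]
    return boundaries

def award(user_output, boundaries):
    """Award the highest grade whose boundary the user output meets; 'C' otherwise."""
    for grade, cut_off in sorted(boundaries.items(), key=lambda kv: -kv[1]):
        if user_output >= cut_off:
            return grade
    return "C"

if __name__ == "__main__":
    outputs = [91, 84, 77, 75, 68, 60, 55, 43, 39, 30]
    cuts = preliminary_boundaries(outputs)          # {'A': 84, 'B': 68} for this data
    print(cuts, award(80, cuts), award(50, cuts))   # ... 'B' 'C'
```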
According to the invention, there is also provided a method of assessing a user comprising the steps of:
a. processing existing user data to determine a test starting point;
b. selecting at least one question and/or task from a database of questions and/or tasks;
c. putting the selected question(s) and/or task(s) to the user;
d. receiving the user's response;
e. processing the user's response and, optionally, the existing user data to generate a user result;
f. repeating steps b, c, d and e until an appropriate user outcome can be generated, the question(s) and/or task(s) selected at each selecting step being dependent on the previous user result or results; and
g. reporting the user outcome.
The existing user data may comprise historical data. The existing user data may comprise existing ability data. The existing user data may comprise attainment data.
Preferably, the existing user data comprises an outcome or outcomes from previous assessments.
The test starting point may be determined via at least one algorithm from a database of algorithms. The question or task may be selected from the database of questions and/or tasks via at least one algorithm from a database of algorithms. The aforementioned databases of algorithms may comprise at least some algorithms whose steps are pre-determined and/or at least some algorithms whose steps are dependent on the existing user data and/or the user result or results and/or at least some algorithms having at least some steps which are pre-determined and at least some steps which are dependent on the existing user data and/or the user result or results.
Advantageously, the user result is generated via at least one marking algorithm from a database of marking algorithms. Preferably, the user outcome is generated via at least one marking algorithm from a database of marking algorithms .
The aforementioned databases of marking algorithms may comprise at least some marking algorithms whose steps are pre-determined (i.e. static algorithms) and/or at least some marking algorithms whose steps are dependent on the existing user data and/or the user's response and/or the user result or results (i.e. wholly dynamic algorithms) and/or at least some marking algorithms having at least some steps which are pre-determined and at least some steps which are dependent on the existing user data and/or the user's response and/or the user result or results (i.e. partially dynamic algorithms).
Preferably, the marking algorithms are periodically moderated.
In an embodiment of the invention, the database of questions and tasks includes questions and tasks with varying content, context, structure and language. Advantageously, the database of questions and tasks includes questions and tasks covering one or more subject areas and/or psychometric and/or motor metric domains . Preferably, the database of questions and tasks includes questions and tasks at a plurality of difficulty levels.
The questions and tasks in the database of questions and tasks preferably include links to other questions and/or tasks in the database to facilitate question and task selection.
In an embodiment of the invention, step d of receiving the user's response further comprises receiving auxiliary information. That auxiliary information may comprise the time taken for the user's response.
The user outcome may comprise a summative report and/or a formative report and/or a progressive report. It will be understood that any feature of the invention described in relation to one aspect of the invention may be applied, where appropriate, to another aspect of the invention .
An embodiment of the invention will now be described with reference to the accompanying drawings, of which:
Figure 1 is a flow diagram showing the whole assessment process;
Figure 2 is a schematic diagram showing the rules structure;
Figure 3 is a schematic diagram showing the evidences matrix;
Figure 4 is a flow diagram showing a method of test construction;
Figure 5 is a flow diagram showing development of marking algorithms;
Figure 6 is a flow diagram schematically showing the Testing process;
Figure 7 is a flow diagram schematically showing the Marking & Output process;
Figure 8 is a diagram of a system for implementing the Testing step of Figure 6 and the Marking & Output step of Figure 7;
Figure 9 is a flow diagram showing moderation of the marking process; and
Figure 10 is a flow diagram showing the process of marking to grading.
As will be seen from Figure 1, the overall process involves three main process steps:
1) Psychometric and motor metric profiling 101.
2) Dynamically adaptive assessment 103 comprising Testing 107 and Marking & Output 109.
3) Diagnostic outcomes 105.
Psychometric and motor metric profiling: Psychometric and motor metric profiling 101 involves using existing information relevant to the assessment process. The requirements for this step will depend on the particular nature of that assessment, e.g. the output requirements and the purpose of the assessment. As can be seen in Figure 1, there are five main data areas used for the first step of psychometric or motor metric profiling, some or all of which may be used.
The first data area is historical information 111, which comprises existing information and any personal data that is relevant to the assessment concerned. As a first example, in the case where the user is a school pupil, the historical data may include any personal information relevant to the pupil's learning (e.g. recent illness) . In a second example, where the user is an employee or interview candidate, historical data may include data on previous work experience and/or in-house assessments. In a third example, where the user is being clinically assessed, the historical data may include data from previous paper or other off-screen assessment instruments. The historical data may also include brain activity data e.g. a CAT Scan output.
The second data area is data on existing ability 113. These data may include information on the user's numerical, spatial, verbal and non-verbal reasoning, the user's IQ, the user's preferred learning style(s) and/or the user's personality type. Those data may also include any available special needs data or psychometric data or motor metric data. Those data may be obtained from previous assessments or from indicative tests.
The third data area is data on attainments 115. In the first example, where the user is a school pupil those data will include previous examination results and educational attainment. In the second example, where the user is an employee or interview candidate, those data will include school examination results and educational attainment as well as any higher education results and other relevant skills. In the third example, where the user is being clinically assessed, those data may include motor skills mastered.
The fourth data area is any other relevant auxiliary data 117 particular to the user.
The fifth data area is diagnostic assessment 105 resulting from any previous assessment, and outputs from the marking and output stage of any previous assessment i.e. feeding back the diagnostic outcomes from the third stage of a previous assessment into the first stage of the next assessment and/or feeding back the outputs from the second stage of a previous assessment into the first stage of the next assessment. This improves the psychometric profiling and hence results in more accurate future outputs and diagnostic outcomes.
The result or outcome or performance data from a given assessment may be fed back into diagnostic assessment data from previous assessments in order to update the data set. This is known as dynamic statistical analysis. As assessees' results are returned, during or after an assessment, those results are fed into a database from which algorithms are used to determine statistical outcomes which determine the future use of the item or task, e.g. the level of difficulty of the item or task for any relevant cohort (e.g. an age cohort) . For example, if a set of user data resulted in a Gaussian distribution of results centred on 60% for a given assessment, new user data when fed into that data set, may have the result of changing the data set e.g. to a Gaussian distribution centred on 59%. And even in less straightforward data sets, dynamic statistical analysis may be performed to update the existing data.
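As an illustration of such dynamic statistical analysis, the sketch below keeps running statistics for a single item and reclassifies its difficulty as new results arrive. The update rule (Welford's running mean and variance) and the banding thresholds are assumptions made for this sketch, not features of the system.

```python
class ItemStatistics:
    """Running statistics for one item/task, updated as each new result arrives
    (a minimal stand-in for the dynamic statistical analysis described above)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0      # running mean score for the item, e.g. 60.0 (%)
        self._m2 = 0.0       # running sum of squared deviations (Welford's method)

    def add_result(self, score: float) -> None:
        self.count += 1
        delta = score - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (score - self.mean)

    @property
    def variance(self) -> float:
        return self._m2 / self.count if self.count else 0.0

    def difficulty_band(self) -> str:
        """Hypothetical rule: reclassify the item's difficulty from its running mean."""
        if self.mean >= 75:
            return "easy"
        if self.mean >= 50:
            return "medium"
        return "hard"

if __name__ == "__main__":
    stats = ItemStatistics()
    for score in (62, 58, 61, 55, 64):   # new assessee results arriving over time
        stats.add_result(score)
    print(round(stats.mean, 1), stats.difficulty_band())   # 60.0 medium
```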
Dynamically adaptive assessment: This is the core area of the overall process. The assessment involves two steps: Testing 107 and Marking & Output 109. In order to describe the process, a number of terms need now to be described.
• Curriculum: The curriculum 201 (see Figure 2) is any area of study or competencies. The curriculum may be dynamic and it may take the form of a set of study units which can be swapped in and out of a user's individual learning scheme. The curriculum defines the domain of the assessment. In the first example where the user is a school pupil, the curriculum may include units of a number of different school subjects. The curriculum could alternatively take the form of non-subject specific units which alter in response to user results or outcomes. In the third example where the user is being clinically assessed, the curriculum may comprise any mix of psychometric or motor metric domains .
• The rules structure: As seen in Figure 2, the rules structure comprises rules 203, attainments 205, elaborations 207, macro evidences 209 and micro evidences 211, each of which will be described below. The rules structure is a way to measure performance of a user via those data and a way to inform the process steps of a dynamic algorithm (see "Decision and marking algorithms" below) . It is well known in the art that progression is defined as the incremental increase in the acquisition of skills and knowledge and/or the ability to reason and provide solutions to problems through maturing cognitive or motor processes. Traditionally, such progressions have been determined by practitioners prejudging the performance of users at differing levels and/or using comparative methods on samples of user results. The dynamic assessment may use one or both of those methods to develop through the use of static, adaptive or dynamic algorithms. Constant monitoring of user standards will improve the validity of the assessment.
Rules 203: For a given subject or study area, a set of rules can be determined by one of the methods given above. The rules set out one or more of: a set of skills, an area of knowledge, a measure of understanding or a level of ability to provide solutions to problems.
In the case of a test which is dynamic, the rules need clarification, such clarification being a determination of the ways in which a user can demonstrate a required ability and/or show attainment of a particular rule. (There, of course, will be many different ways in which a user can demonstrate a required ability or particular attainment.)
A dynamic rule can be modified as a result of collective user performance, a modification to the curriculum (especially if that curriculum is, itself, dynamic) , shifts in technology or changing standards over time.
The curriculum 201 and the rules 203 are closely linked and each one alters as a result of changes in the other so that there is continuous feed back between them. In addition, the links themselves between the curriculum and the rules may change throughout an assessment or assessments.
Attainments 205 and Elaborations 207: An attainment 205 is a given level of achievement e.g. in the case where the user is a school pupil, it might be the ability to achieve a grade C in GCSE mathematics. Attainments are dynamic by virtue of being able to respond to individual solutions to a task or problem (i.e. there are an infinite number of ways in which a pupil can demonstrate the ability to achieve a grade C in GCSE mathematics, each of which is the required attainment). Each rule 203 has one or more attainments 205 contributing to it and each attainment 205 has one or more elaborations 207 contributing to it. An elaboration 207 is a detailed breakdown of how an attainment may be evidenced e.g. for the above case, an associated elaboration might be the ability to achieve a certain level in, say, geometry. It can be seen that there will be a number of such elaborations making up the attainment. In addition each elaboration may contribute to a number of different attainments. In addition, the relationship between attainments and elaborations may include Boolean operators e.g. for a given attainment, elaboration X AND elaboration Y may be required or elaboration A NOT elaboration B may be required.
Elaborations 207 are more susceptible to change than attainments 205 as they are less generic and are more prone to alter as new methods of elaborating attainments are developed. Thus, elaborations for a given attainment are subject to change. Elaborations for a given attainment may be dynamic. In addition, for attainments that change, there will be new elaborations. Thus, the attainments and elaborations are closely linked and each one changes as a result of changes in the other so that there is continuous feedback between them.
Macro evidences 209: For each elaboration 207, there are many macro evidences 209. These macro evidences make up each elaboration and, in addition, each macro evidence will contribute to a number of different elaborations i.e. there is a many to many relationship between macro evidences and elaborations. For the example of an elaboration given above, an associated macro evidence might be a demonstration of understanding of, say, isosceles triangles. Different groups of macro evidences may contribute in two distinct ways to elaborations. Firstly, groups of the same macro evidence may indicate competence over chance i.e. a user is tested many times for a given elaboration, hence providing a group of macro evidences pointing to that elaboration and showing consistency. Secondly, groups of different macro evidences may indicate breadth of understanding.
Micro evidences 211: Micro evidences 211 are the lowest meaningful actions that a user makes in the assessment. In practice they may be where a piece of data or a discrete digital object, a file, a graphic or similar has something meaningful done to it by a user (e.g. the user has opened a file or amended a graphic) . For the above example, a micro evidence might be the selection of correct answer B in response to a multiple choice question testing the angles in isosceles triangles. The micro evidences 211 may also include auxiliary data e.g. time data (specifying the speed of processing) which may make a valid contribution to macro evidences 209 and then to an elaboration 207 e.g. one detailing efficiency. In the above example, a high level user may input correct answer B very quickly whereas a lower level user may take longer to input correct answer B. For each macro evidence, there are many micro evidences and for each micro evidence there are many macro evidences i.e. there is a many to many relationship between macro evidences and micro evidences .
We see from Figure 2 and from the above description of the terms, that the rules are the highest level indicators in the rules structure and the micro evidences are the lowest.
Figure 3 shows the evidence matrix 301, each macro evidence being represented schematically at 303. Referring to Figure 3, we see that macro evidences may be gained at a range of levels (1, 2....) across a range of strands (AA, AB... ZZ). The pathway through the evidence matrix may be via any number of routes and is not necessarily linear. The pathway through levels may be progressive i.e. coverage at one level may imply coverage or partial coverage at one or all lower levels or it may not be progressive i.e. coverage at a particular level can be achieved irrespective of coverage at other levels and does not necessarily imply coverage at lower levels. The pathway through levels may or may not be linear. Certain strands may be relevant to only certain levels.
The evidence matrix may relate to one or more domains or subject areas and there may be links to other evidence matrices relating to other domains or subject areas.
We see that each macro evidence 303 comprises a number of micro evidences 305 (shown schematically at 307) . Each micro evidence 305 in a given macro evidence 303 will also form part of another macro evidence . We also see that a number of macro evidences (those shown cross hatched) make up a given elaboration. The elaborations are dynamic in that the macro evidences comprising a given elaboration will vary for different users .
Referring once again to the previous example together with Figure 3, the micro evidences 305 are, say, answers to multiple choice questions relating to isosceles triangles . The overall macro evidence 303 is a demonstration of understanding of isosceles triangles. That macro evidence, along with many other (cross-hatched) macro evidences at a number of different levels (the actual number and type of macro evidences being dependent on the user) , contribute to the "geometry" elaboration. Clearly, an understanding of isosceles triangles may also be useful in, say, trigonometry so that macro evidence will also contribute to the "trigonometry" elaboration. Also, a given multiple choice question relating to isosceles triangles may also test an understanding of, say, similar triangles, in which case, that micro evidence will also be associated with the "similar triangles" macro evidence.
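The many-to-many relationships of the evidence matrix can be pictured with the small data-model sketch below, which rolls the isosceles-triangle micro evidences of the example up to their elaborations. The dictionaries and their keys are hypothetical illustrations only.

```python
from collections import defaultdict

# Hypothetical many-to-many links, mirroring Figures 2 and 3: each micro evidence can
# support several macro evidences, and each macro evidence several elaborations.
MICRO_TO_MACRO = {
    "isosceles-mcq-answer-B": {"understands isosceles triangles", "understands similar triangles"},
    "isosceles-construction": {"understands isosceles triangles"},
}
MACRO_TO_ELABORATION = {
    "understands isosceles triangles": {"geometry", "trigonometry"},
    "understands similar triangles": {"geometry"},
}

def elaborations_evidenced(micro_evidences):
    """Roll captured micro evidences up to the elaborations they contribute to."""
    contributions = defaultdict(set)
    for micro in micro_evidences:
        for macro in MICRO_TO_MACRO.get(micro, set()):
            for elaboration in MACRO_TO_ELABORATION.get(macro, set()):
                contributions[elaboration].add(macro)
    return dict(contributions)

if __name__ == "__main__":
    print(elaborations_evidenced(["isosceles-mcq-answer-B"]))
    # {'geometry': {...both macro evidences...}, 'trigonometry': {'understands isosceles triangles'}}
```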
As already mentioned, the rules structure (comprising rules, attainments, elaborations, evidences and atomic indicators) is a way to measure user performance and thereby to assess user progression. That progression is defined as the incremental increase in the acquisition of skills and knowledge and/or the ability to reason and provide solutions to problems through maturing cognitive or motor processes. Thus, those maturing cognitive or motor processes of a given user can be demonstrated by the components of the rules structure. Because the rules structure ranges from high to low level indicators, those maturing cognitive or motor processes may be demonstrated at a number of levels.
  • Decision and marking algorithms: The algorithms used in the assessment may be static or dynamic, in either a static or dynamic test. A static algorithm has a pre-determined formula (set of steps) for calculating the user's performance and hence determining the test's next pathway. Static algorithms can themselves form part of a dynamic set of algorithms. A dynamic algorithm is one which is itself determined by the user's pathway through the assessment. A dynamic algorithm uses a set of rules which is, itself, dynamic i.e. dependent on the user's pathway through the assessment.
  • Trigger points: Trigger points are the points in the assessment at which a different course of action is required.
The trigger points may be either static or dynamic. Static trigger points are at a pre-determined position in the assessment. The static trigger points occur in an assessment where it is expected that a user will have provided a given level of information, number of evidences or demonstration of skill. In an adaptive test, the trigger point informs the algorithm to make a calculation to affect the next pathway decision. Dynamic trigger points are the points at which the accumulated evidence determines that a particular course of action is required to elicit a further type of evidence. That course of action may be, for example, a change of difficulty level, a change of direction of task or the asking of a direct question in a task based test. Dynamic trigger points are determined by a profiling algorithm which makes judgements on performance based on the evidences gained. The dynamic trigger points are usually embedded at the elaborations level in the structure shown in Figure 2, in which case they are used to determine pathways and report. Alternatively they may be embedded at the rules level, to determine shifts in curriculum coverage or at the evidences level to influence tasks or question based decisions. The trigger points may be action, inaction or time driven.
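A hypothetical dynamic trigger point might be sketched as follows; the window size, the time threshold and the three possible courses of action are assumptions chosen purely for illustration.

```python
def dynamic_trigger(evidence_log, window=4):
    """A hypothetical dynamic trigger point: look at the most recent evidences and
    decide whether a change of course is needed to elicit a further type of evidence."""
    recent = evidence_log[-window:]
    if len(recent) < window:
        return None                          # not enough evidence yet - no trigger
    correct = sum(1 for e in recent if e["correct"])
    if correct == window:
        return "raise difficulty level"      # consistent success
    if correct == 0:
        return "lower difficulty level"      # consistent failure
    if all(e["seconds"] > 120 for e in recent):
        return "ask a direct question"       # slow, mixed evidence in a task-based test
    return None

if __name__ == "__main__":
    log = [{"correct": True, "seconds": 30}] * 4
    print(dynamic_trigger(log))   # raise difficulty level
```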
Thus, the sufficiencies outlined below are preferably dynamic in that the number of opportunities to demonstrate a particular skill, knowledge or competence is dependent on the particular user's pathway through the assessment and also in that the number of demonstrations required is also dependent on the user's pathway.
  • Sufficiency: This is one of the elements in any marking algorithm used to establish whether a user has demonstrated that he has the required skill, knowledge or competence. In a dynamic system, the sufficiency is determined by the number of opportunities to provide evidence which a particular pathway through the assessment has provided. Since the pathway will be different for different users, the sufficiencies will also be different for different users. In a static system, the sufficiency thresholds can be dynamic in order to reflect overall performance as compared with the theoretical performance predicted by the test constructor.
Sufficiency of rules coverage: This sufficiency comprises a set of algorithms, either static or dynamic, which determine whether a user has attained a particular standard, level or grade (should one be required by the tester or test agent) or to determine the user's profile on a progression matrix. This sufficiency can be dynamically altered either during or post test, to suit the needs of the tester or test agent. It is likely (but not required) that the dynamic sufficiency would be used in conjunction with a dynamic curriculum.
Sufficiency of attainments coverage: The sufficiency comprises a set of algorithms, either static or dynamic, that determine sufficient competence in specific measure of attainment, which in turn inform the sufficiencies algorithms employed by the rules coverage. These algorithms can be used dynamically during and post assessment, depending on a user's pathway, especially where a dynamic assessment is used across a dynamic curriculum. This sufficiency is preferably predetermined for the purposes of assessment creation, in which case it may be dynamically changed by actual performance. Alternatively, this sufficiency may be determined by the computer means implementing the assessment (see Figure 8), in the case of an assessment generated by the computer means itself.
Sufficiency of elaborations coverage: This sufficiency comprises a set of algorithms, either static or dynamic, that define whether a user has demonstrated competence in a contributing factor to an attainment and which in turn inform the sufficiencies algorithms employed by the attainments coverage. The sufficiency of elaborations determined at the point of test writing, is subject to dynamic change. As with attainments, this is likely to happen after the test, although when used in a dynamic test with a dynamic curriculum it is more likely that the algorithms for determining sufficiency of elaborations will need to be dynamic. The overall process allows for both. The sufficiency thresholds will be test specific.
Sufficiency of evidences coverage: This sufficiency comprises a set of algorithms, either static or dynamic, which define whether a user has demonstrated sufficient competence of evidences which in turn inform the sufficiencies algorithms employed by elaborations coverage. As the rules based marking model moves downwards through the structure (see Figure 2), definitions become more precise and therefore there are many more evidences than elaborations and many more elaborations than attainments. This means that it is most likely that dynamic algorithms will be used to calculate the evidence to elaboration relationships and thresholds.
For clarification, dynamic algorithms are algorithms in which one or more of the process steps in the algorithm is not pre-determined but is, in fact, dependent on the results obtained.
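For example, a dynamic sufficiency rule of the kind described above might scale its threshold with the number of opportunities a particular user's pathway happened to provide. The ratio and minimum used below are arbitrary illustrative values, not thresholds defined by the system.

```python
def elaboration_sufficient(macro_evidences, opportunities, base_ratio=0.6, minimum=3):
    """Hypothetical dynamic sufficiency rule for one elaboration.

    macro_evidences - number of supporting macro evidences actually demonstrated
    opportunities   - number of chances this user's pathway offered to demonstrate them
    The threshold scales with the opportunities the pathway happened to provide,
    so two users with different pathways face different (dynamic) thresholds."""
    required = max(minimum, round(base_ratio * opportunities))
    return macro_evidences >= required

if __name__ == "__main__":
    print(elaboration_sufficient(macro_evidences=4, opportunities=5))   # True  (needs 3)
    print(elaboration_sufficient(macro_evidences=4, opportunities=9))   # False (needs 5)
```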
• Outputs: The outputs at the end of the dynamically adaptive assessment process can take the form of an adaptive learning environment, adaptive instruction both within and outside the system and/or a report.
Reporting: Reports can be summative, formative and/or progressive (see below) . Reports can be of any length and can focus upon individual competences, processes, skills or other factors or can comprise a number of different competences, processes or skills. If the assessment is being made over time, the report may comprise a number of partial reports, each giving a partial "credit" to the user. For example, the assessment may be to measure an increase in a particular skill over time, in which case each partial report will show a skill level at a given stage and the total report will show the increase in the skill level over time. An example of this type of assessment is a patient convalescing after an accident, the assessment being used to measure an improvement in, say, the motor skills of the patient. The assessment will show the general trend of the improvement, in the total report, and will also show the particular results in the partial reports which may be useful to indicate what type of factors are influencing the patient's improvement. Or, the assessment may be to measure different skills at different times (i.e. a modular-style course) in which case the partial reports can be saved and eventually contribute to a total report at the end of the assessment. An example of this type of assessment is a user undertaking a modular course of study e.g. a part time degree course, where the assessment may be taking place over several years via a number of different modules, which may be cross curricular or cross psychometric and/or motor metric domains.
Summative reports give a simple user profile at the end of a learning/testing stage. A summative report will aim to draw up a "level" or a grade, or a point in a progression matrix or profile for the user.
Formative reports aim to guide the user and/or influence future learning models. A formative report may indicate a "level" or a grade, or a point in a progression matrix or profile, but will also aim to aid future progress or personal development or remedial action.
Progressive reports may give the user information regarding the best pathway for maximised future learning, personal development, psychometric or motor metric improvement or remedial action based on the assessment.
The dynamically adaptive assessment (comprising the separate "Testing" and "Marking & Output" steps) is a dynamic assessment. As already mentioned, this is distinct from an adaptive assessment (which has a number of pre-determined, definable pathways through it) by virtue of having any number of possible pathways through the assessment (which correspond to the number of possible users of the assessment) .
The terms which have been described above (attainments, elaborations, trigger points etc) each fit into the assessment system, either as part of the testing step or as part of the marking and output step. Each of those terms may itself be static or dynamic. A static term is one which is entirely pre-determined. A dynamic term is one which is partially or wholly dependent on the user or users or user action. For example, if the elaborations are static, there are a given number and group of elaborations to demonstrate a particular attainment, irrespective of the particular user and the elaborations demonstrated to date. Alternatively, if the elaborations are dynamic, the number and group of elaborations required for a given attainment is not predetermined but is dependent on the user i.e. the elaborations/evidences demonstrated so far, the psychometric profiling data etc.
The assessment may cover all or part of the user's curriculum at one or more different levels. The actual pathway through the dynamic assessment is determined by the evidences shown as the user answers the questions or completes the tasks and the pathway is also affected by the purpose of the assessment, i.e. the skills, knowledge, understanding and/or applications to be assessed which in turn affect the algorithms embedded into the assessment.
The elements of the Testing step which contribute to the dynamic character of the assessment are the breadth of the curriculum or curricula covered, the structure of the test and the questions or tasks themselves.
The breadth of the curriculum or curricula covered: Depending on the dynamic nature of the curriculum, as described previously, the test can, according to the outcomes and inferences made, be self-determining as to whether or not it covers a broad spread or a narrow focus of the curriculum or whether it includes cross curricula performance, for example maths as part of a physics assessment, English as part of a history assessment or more skills as part of an ICT (Information and Communications Technology) assessment. An example of this may be that a user has evidenced a particular level of performance which precludes them from being awarded a higher level, but the test may wish to pursue areas of strength and/or weakness in the lower level for formative and/or progression reporting.
The structure of the test: In terms of test structure, a question-based test may dynamically alter the level of difficulty of the question within a level to refine its ranking decision and reporting. In a task or process based test, the decision may be to ask a direct question to elicit evidence .
The questions or tasks themselves: The questions and/or tasks which form the test have in themselves dynamic elements: content e.g. data, information, graphics, digital assets; context e.g. the setting within which a question or task resides; the structure of the task or question e.g. how and why it has been written; and the language used in the question or task.
Content: The content of a question or task may be dynamically changed. For example, a list of alternative proper nouns or names may be substituted where a place-holder is embedded in the question or task. Alternatively, where data is collected from a data source (such as a weather station) during the test, data may be called and used via dynamic links or randomly generated. This form of dynamic change will need to be managed through a set of parameters or a fixed alternative data set or data sets to ensure that what is being tested is what was intended to be tested. The marking algorithms may be dynamically changed to accommodate any dynamic change in content.
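As an illustration of this kind of dynamic content, the sketch below (an assumption-laden example, not the patented implementation; the template, place names and parameter bounds are invented) substitutes a place-holder from a fixed alternative data set, generates numeric data within managed parameters, and derives the corresponding marking value from the generated content.

```python
# Minimal sketch of dynamic content: a question template with place-holders whose
# content is drawn from a fixed alternative data set or randomly generated within
# managed parameters, so the concept being tested is unchanged.
import random

TEMPLATE = "The temperature at {place} fell from {t1} degrees C to {t2} degrees C. What was the drop?"

PLACES = ["Aberdeen", "Cardiff", "Norwich"]   # fixed alternative data set (illustrative)
PARAMS = {"t1": (10, 20), "t2": (-5, 9)}      # parameter bounds for generated data


def build_question(rng: random.Random) -> tuple[str, int]:
    t1 = rng.randint(*PARAMS["t1"])
    t2 = rng.randint(*PARAMS["t2"])
    question = TEMPLATE.format(place=rng.choice(PLACES), t1=t1, t2=t2)
    expected = t1 - t2                        # marking value changes with the content
    return question, expected


if __name__ == "__main__":
    q, answer = build_question(random.Random(42))
    print(q)
    print("Expected answer:", answer)
```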
Context: The context of a question or task may be dynamically or statically changed to make the test relevant to the user. Dynamic change may be by altering the context whilst the test is in progress, whereas static change may be by a choice of fixed contexts at the start of the test.
Structure of the task or question: Dynamically changing the structure of a question or task may be performed when a user shows signs of misunderstanding or misinterpreting the intention of a question or task. For example, the test marking algorithms may detect inappropriate responses and thereby infer that the user requires a different structure to the question or task to suit their particular learning style or cognitive profile. The particular learning style or cognitive profile may, for example, determine the mode of delivery e.g. audio instead of text questions. The user's psychometric profiling data may also aid the algorithms in this respect. The dynamic test may, as a result, present the question or task graphically rather than textually, or aurally through the use of a text to speech synthesiser. An alternative method of dynamically altering the test structure is to simplify the grammatical elements of the question, task, resource or information. A dynamically changing structure may be used if the psychometric profiling data show signs of special needs, e.g. dyslexia. The marking algorithms may be dynamically changed to accommodate any dynamic change in structure.
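A hedged sketch of such a structural decision follows; the profile keys, thresholds and delivery modes are purely illustrative assumptions, not part of the specification.

```python
# Minimal sketch: choosing how to re-present a question when the profiling data or
# the response pattern suggests a different structure is needed. Illustrative only.

def choose_delivery(profile: dict, inappropriate_responses: int) -> str:
    """Pick a presentation mode for the next question or task."""
    if profile.get("special_needs") == "dyslexia":
        return "audio"            # text-to-speech rather than written text
    if profile.get("learning_style") == "visual":
        return "graphical"
    if inappropriate_responses >= 2:
        return "simplified_text"  # restructure the wording before abandoning the item
    return "text"


if __name__ == "__main__":
    print(choose_delivery({"special_needs": "dyslexia"}, 0))   # audio
    print(choose_delivery({"learning_style": "visual"}, 0))    # graphical
    print(choose_delivery({}, 3))                              # simplified_text
```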
Language: The complexity of the language employed in the test may be dynamically changed by, for example, applying a fog index and utilising a thesaurus, whilst maintaining the integrity of the question or task, where the psychometric profiling data or user performance show signs of special needs.
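One plausible reading of the fog-index check is sketched below using the Gunning fog formula, 0.4 × (average words per sentence + 100 × proportion of words with three or more syllables); the syllable counter is a crude approximation chosen only for illustration and is not prescribed by the specification.

```python
# Minimal sketch of a readability check of the kind mentioned above, using the
# Gunning fog index. The syllable counter is a rough approximation, sufficient
# only for illustration.
import re


def approx_syllables(word: str) -> int:
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fog_index(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    complex_words = [w for w in words if approx_syllables(w) >= 3]
    if not sentences or not words:
        return 0.0
    return 0.4 * (len(words) / len(sentences) + 100 * len(complex_words) / len(words))


if __name__ == "__main__":
    original = ("Utilising the accompanying meteorological information, "
                "determine the approximate precipitation differential.")
    simplified = "Using the weather data given, work out the difference in rainfall."
    print(round(fog_index(original), 1), round(fog_index(simplified), 1))
```

Re-wording via a thesaurus would then aim to lower the index while preserving the concept being assessed.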
The element of the Marking & Output step which contributes to the dynamic character of the assessment is the use of complex marking arrays. In their dynamic form, the marking arrays are unique to each user e.g. the route through the marking array is multi-dimensional and depends on the particular user rather than being pre-determined. In their static form, the marking arrays are reduced to a "one-dimensional" marking scheme where the route through the marking scheme is a pre-determined one-dimensional line.
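The difference between a one-dimensional marking scheme and a multi-dimensional marking array might be pictured as in the sketch below; the keys, evidence labels and mark values are illustrative assumptions only.

```python
# Minimal sketch: a static, one-dimensional marking scheme versus a marking array
# whose route depends on the particular user's state as well as the response, so
# no single pre-determined line through it exists. Illustrative values only.

ONE_DIMENSIONAL = {"A": 2, "B": 1, "C": 0}   # response -> marks, same for every user

# (response, evidence already shown, special-needs flag) -> marking decision
MARKING_ARRAY = {
    ("A", "strong", False): ("award", 2),
    ("A", "weak", False):   ("probe further", 1),   # same response, different route
    ("B", "strong", False): ("award", 1),
    ("B", "weak", True):    ("re-present aurally", 0),
}


def mark(response: str, evidence: str, special_needs: bool) -> tuple[str, int]:
    return MARKING_ARRAY.get((response, evidence, special_needs), ("refer to moderator", 0))


if __name__ == "__main__":
    print(ONE_DIMENSIONAL["A"])          # 2 for everyone
    print(mark("A", "strong", False))    # ('award', 2)
    print(mark("A", "weak", False))      # ('probe further', 1)
```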
The overall process is implemented on a computer based system. The system comprises host computer means (which has access to the tasks, questions, algorithms etc) and one or more user computer means which output questions/tasks (in a number of different ways), receive question/task responses (in a number of different ways) and may also output outputs and/or diagnostic outcomes (in a number of different ways) .
Figure 4 shows one method of test construction. The computer-based system holds existing data, tasks and/or questions in a searchable database together with any auxiliary information (e.g. user response speed). Each piece of data in the database is linked to many other pieces of data (a many-to-many relationship) in order to provide links for use by the test developer and/or host computer means.
As shown in Figure 4, at the first stage, the assessment developer requests data from the database, step 401. The assessment developer may be human or computer-based. The data search engine then searches the database for pieces of data (step 403). Those data may be questions or tasks 405, assessment items 407, marking schemes 409, user responses 411 or auxiliary data 412. Assessment items 407 may include a piece of data, a discrete digital object, a file, a graphic, a web page, a hyperlink, a partially completed equation etc. Auxiliary data 412 may include e.g. time data relating to user responses. There may be SEN (Special Educational Needs) or other checks 413 (including motor process checks) of one or all sets of those data. Also note the feedback loops between the data sets. The new assessments are developed at step 421 via completely new test data 415, modified existing test data 417 and unmodified existing test data 419. By using such a database, efficiency of test development is improved as well as the consistency and accuracy of the tests themselves.
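A minimal sketch of such an item bank, assuming a simple SQLite schema (the table and column names are invented for illustration), shows how many-to-many links between pieces of data can be stored and followed:

```python
# Minimal sketch: a searchable item bank in which each piece of data is linked to
# many others through a link table, so a developer or host computer means can
# follow many-to-many relationships. Schema and sample rows are assumptions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER PRIMARY KEY, kind TEXT, body TEXT);  -- question, task, marking scheme, response, auxiliary
CREATE TABLE item_link (from_id INTEGER, to_id INTEGER,
                        PRIMARY KEY (from_id, to_id));             -- many-to-many links
""")
conn.executemany("INSERT INTO item VALUES (?, ?, ?)", [
    (1, "question", "Solve x^2 - 5x + 6 = 0"),
    (2, "marking_scheme", "1 mark per correct root"),
    (3, "auxiliary", "median response time: 90 s"),
])
conn.executemany("INSERT INTO item_link VALUES (?, ?)", [(1, 2), (1, 3), (2, 1), (3, 1)])

# Everything linked to question 1, for reuse or modification in a new assessment.
for kind, body in conn.execute(
        "SELECT i.kind, i.body FROM item_link l JOIN item i ON i.id = l.to_id WHERE l.from_id = 1"):
    print(kind, "->", body)
```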
SEN (Special Educational Needs) is a well-known educational term and comprises use of e.g. enlarged text, speech to text synthesisers, extended keyboards etc.
Figure 5 shows the development of trigger points and marking algorithms. The computer-based system holds marking algorithms and trigger points already known in static and adaptive testing. At the first stage, the assessment developer (either human or computer-based) works on the initial ideas, step 501. Those ideas are inputted to modelling means 503, which models the initial ideas, receiving inputs from the database of marking algorithms 505. Potential models are outputted at 507 and fed back to the assessment developer or computer means to modify the assessment developer's ideas and ensure sufficiency. The models are also inputted into the database 505 for future use. The feedback loop of Figure 5 uses existing models combined with concepts of assessment developers to improve assessment development.
Once the assessment has been constructed in accordance with Figures 4 and 5, the actual assessment process can proceed. Referring again to Figure 1, we see that the dynamically adaptive assessment involves two steps: Testing and Marking & Output.
The Testing process is shown schematically in Figure 6. The data from the psychometric profiling step 601 are inputted into a process 603 which determines an appropriate test entry point using static, adaptive and/or dynamic logic and/or algorithms. It should be noted from Figure 1 that the psychometric profiling data itself comprises diagnostic outcomes and outputs fed back from previous assessments. Once a test entry point has been determined, a task or question is selected at step 605 from a database 607 of questions and tasks. After the user's response is received, the process repeats i.e. further questions and/or tasks are selected and further user responses are received. The user's response or responses received at step 609 together with the psychometric profiling data 601 will determine an appropriate next task or question. After an appropriate number of repeats (determined by the user's responses and the psychometric profiling data), the test data 611 is ready to be input into the Marking & Output process.
The Marking & Output process is shown schematically in Figure 7. The data 701 from the testing and the data 703 from the psychometric profiling step are inputted into a marking process 705 which uses static, adaptive and/or dynamic logic and/or algorithms to generate outputs 707. Those outputs 707 may, in turn, be fed back into the psychometric profiling data 703 if appropriate.
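The loop of Figures 6 and 7 might be sketched as follows; the item bank, difficulty ladder, thresholds and stopping rule are all assumptions made for illustration rather than details taken from the specification.

```python
# Minimal sketch of the loop in Figures 6 and 7: an entry point inferred from the
# profiling data, repeated selection of the next item based on responses, then a
# marking pass over the captured test data. All values are illustrative.

ITEM_BANK = [
    {"id": 1, "difficulty": 1, "answer": "6"},
    {"id": 2, "difficulty": 2, "answer": "12"},
    {"id": 3, "difficulty": 3, "answer": "24"},
    {"id": 4, "difficulty": 4, "answer": "48"},
]


def entry_point(profile: dict) -> int:
    return 3 if profile.get("prior_level", 0) >= 2 else 1


def next_difficulty(current: int, correct: bool) -> int:
    return min(4, current + 1) if correct else max(1, current - 1)


def run_test(profile: dict, responses: dict[int, str], max_items: int = 3) -> list[dict]:
    difficulty, captured, used = entry_point(profile), [], set()
    for _ in range(max_items):
        candidates = [i for i in ITEM_BANK if i["id"] not in used]
        item = min(candidates, key=lambda i: abs(i["difficulty"] - difficulty))
        used.add(item["id"])
        correct = responses.get(item["id"], "") == item["answer"]
        captured.append({"item": item["id"], "correct": correct})
        difficulty = next_difficulty(difficulty, correct)
    return captured


def mark(captured: list[dict]) -> str:
    score = sum(c["correct"] for c in captured)
    return "pass" if score >= 2 else "refer for formative report"


if __name__ == "__main__":
    data = run_test({"prior_level": 2}, {3: "24", 4: "48", 2: "12"})
    print(data, "->", mark(data))
```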
The dynamically adaptive assessment can be implemented on the system illustrated in Figure 8. The system 801 comprises host computer means 803 and one or more user computer means 805, 805', 805''.... The host computer means 803 has access to a database of questions and tasks 807, a database of algorithms for test entry point selection 809 and a database of marking algorithms 811. The host computer means 803 has inputs from the user computer means 805 and from any other data e.g. the psychometric profiling data. The host computer means 803 has output(s) to the user computer means 805. The user computer means has output to a user (for putting questions and tasks to a user) and input from a user (for user responses).
The host computer means 803 uses an inferential engine (not shown) for its process steps. The inferential engine allows the host computer means 803 to make inferences from the available data to generate the next process step. Referring to Figure 6, the inferential engine is used by the host computer means 803 to deduce a suitable test entry point based on the psychometric profiling data (step 603) and also to select a question or task based on user response, which in turn depends on the previous question or task and the psychometric profiling data (step 605). Referring to Figure 7, the inferential engine is used by the host computer means 803 to generate an output 707 via a marking process 705 which is based on captured test data 701 and psychometric profiling data 703. Marking can be in the form of logic decisions based on outputs from the inferential engine. The inferential engine also allows the computer means to make inferences from the available data in other areas e.g. assessment development, generation of marking algorithms.
An inference engine may, for example, be applied: context to context, within a subject domain; subject domain to subject domain (e.g. Physics to Maths or History to English); non-curriculum to curriculum (e.g. verbal reasoning to English); or award to award or unit to unit (e.g. NVQ metalworking to GNVQ Plumbing).
Previous data (e.g. subject-specific related outcomes and curriculum coverage linked to non-subject domain information, including data from outcomes of other subject domains as well as psychometric and motor metric data) may determine the start point in terms of, for example, the area of curriculum or the method of delivery (e.g. an audio version for blind people).
For example, a person may be about to be assessed in a particular subject domain (say Mathematics), and the starting point may be determined from his or her previous results in other subject domains (e.g. Physics) where underlying mathematical concepts are implicitly assessed. The resultant profile could determine the level of difficulty start point for the Maths assessment, based on the concepts already assessed in Physics, or indeed the curriculum coverage specifically required by the Maths test to award a grade or outcome.
Thus a first user may be delivered a test that starts with quadratic equations, determined by a conceptual profile from previous Maths tests and Physics/Chemistry. A second user may be delivered a test starting with a higher mathematical concept, in the same area of the curriculum - say complex quadratics, based on the second user's performance profile.
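A minimal sketch of this cross-domain entry-point inference is given below; the concept ladder, evidence sources and thresholds are illustrative assumptions.

```python
# Minimal sketch of the cross-domain inference described above: prior performance in
# Physics, where some mathematical concepts are implicitly evidenced, sets the entry
# concept for a Maths test. Domains, concepts and thresholds are assumptions.

MATHS_LADDER = ["linear equations", "quadratic equations", "complex quadratics"]

# concept -> (other-domain evidence that implies it has been exercised, threshold)
IMPLIES = {
    "quadratic equations": ("Physics:projectile motion", 0.7),
    "complex quadratics": ("Maths:quadratic equations", 0.8),
}


def entry_concept(prior_outcomes: dict[str, float]) -> str:
    start = MATHS_LADDER[0]
    for concept in MATHS_LADDER[1:]:
        source, threshold = IMPLIES[concept]
        if prior_outcomes.get(source, 0.0) >= threshold:
            start = concept            # enough implicit evidence to skip ahead
        else:
            break
    return start


if __name__ == "__main__":
    first_user = {"Physics:projectile motion": 0.75}
    second_user = {"Physics:projectile motion": 0.9, "Maths:quadratic equations": 0.85}
    print(entry_concept(first_user))    # quadratic equations
    print(entry_concept(second_user))   # complex quadratics
```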
This approach reduces the length of the assessment whilst maintaining sufficient evidence.
Equally, an inference engine may look at a non-related subject domain and, through the use of statistical inference, determine that people who achieve a specific level or grade in one subject domain usually perform at a certain level in another. (This can also apply to statistical inferences from non-subject domains, e.g. Verbal Reasoning or Spatial Awareness assessments.)
Thirdly, an inference engine may use dynamic statistical information to predict relationships between concepts within a curriculum (static or dynamic) and direct tasks or questions to test that prediction; if sufficient inferences (or hypotheses) are established, the remainder can be assumed also to be upheld.
Again this reduces the assessment length whilst maintaining sufficient evidence for assessment validity.
The marking process should be moderated to facilitate test development and standardisation, and this process is shown schematically in Figure 9. In this embodiment, the marking and moderation are "human" but are supported by a computer-based system. Alternatively, the marking and moderation may be carried out by the host computer means e.g. using Artificial Intelligence or Neural Network engines. The test data 901 captured by the system are transferred to human marking 903 and human moderation 905. The human marking and moderation may be carried out in a number of ways: on-screen marking and moderation i.e. directly into the computer-based system; off-screen marking followed by data entry (either manual or automatic) into the computer-based system; or on-screen marking combined with other data to assist in the marking processes i.e. somewhere between marking directly onto the computer-based system and marking off the computer-based system with a separate data entry step. It is possible for the marking and/or moderation steps to be entirely automatic (i.e. without human input) using a computer-based system comprising Artificial Intelligence or Neural Network engines - this is not shown specifically in Figure 9. The results of the marking 903 and/or moderation 905 steps are fed into a comparator 907 which checks quality and consistency. The outputs from the comparator 907 are fed back into the marking 903 and/or moderation 905 steps to provide a feedback loop improving consistency.
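The comparator's quality and consistency check might, for example, flag marker/moderator discrepancies beyond a tolerance for re-marking, as in this sketch (the tolerance and scores are invented for illustration):

```python
# Minimal sketch of the comparator in Figure 9: marker and moderator scores for the
# same responses are compared, and items outside tolerance are fed back for
# re-marking or re-moderation. Tolerance and data are illustrative assumptions.

def compare(marker: dict[int, int], moderator: dict[int, int], tolerance: int = 1) -> list[int]:
    """Return the ids of responses whose marks disagree by more than the tolerance."""
    return [rid for rid in marker
            if rid in moderator and abs(marker[rid] - moderator[rid]) > tolerance]


if __name__ == "__main__":
    marker_scores = {101: 4, 102: 2, 103: 5}
    moderator_scores = {101: 4, 102: 5, 103: 4}
    flagged = compare(marker_scores, moderator_scores)
    print("Feed back for review:", flagged)   # [102], a 3-mark discrepancy
```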
The outputs from the comparator 907 are also fed (either automatically or manually) into an awarding system 909 which may be arranged to provide outputs in a number of ways e.g. comparison charts, improved marking schemes, particular examiner data.
Figure 10 shows a system which may be used to connect the marking process outlined in Figure 7 to an award of levels or grades. All tests (whether static, adaptive or dynamic) rely on a marking scheme and a calculation to convert user responses into some sort of level, award or grade. The process used in this embodiment to provide an assessment of levels is shown in Figure 10 and this process improves the validity, reliability and efficiency of the marking process, giving more precision to the awarding process and more control to the examining experts (or computer-based system comprising Artificial Intelligence or Neural Network engines) . The process consists of six stages, the outputs from each stage being fed back into the previous stage to ensure the highest possible accuracy for future tests.
At the first stage 1001, the outputs of the users are confirmed (e.g. confirming a particular percentage previously established for each user). At the second stage 1003, material is obtained for setting preliminary levels for the user outputs (e.g. categorising the user results into groups e.g. grade A, B...). At the third stage 1005, preliminary level boundaries are set using static, adaptive and/or dynamic algorithms (e.g. grade A/B boundary set at 75%). At the fourth stage 1007, the level boundaries established at the third stage are confirmed by one or more examining experts (or computer-based system comprising Artificial Intelligence or Neural Network engines). At the fifth stage 1009, the actual user outputs are reviewed. This is likely to be a review of user outputs falling just below or above a level boundary and/or user outputs which are unexpected given previous evidence (e.g. the results of a user achieving say 74% may be reviewed). At the sixth stage 1011, the level boundaries (and hence the level awarded to each user) are finally confirmed.
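Stages 1005 and 1009 might be sketched as below; the grade boundaries and review margin are illustrative assumptions, not values from the specification.

```python
# Minimal sketch of stages 1005 and 1009: preliminary level boundaries are applied to
# confirmed user percentages, and results falling just below or above a boundary are
# flagged for expert review. Boundaries and margin are assumptions.

BOUNDARIES = {"A": 75, "B": 60, "C": 45}   # preliminary grade boundaries (percent)
REVIEW_MARGIN = 2                           # percentage points either side of a boundary


def award(percent: float) -> str:
    for grade, cut in BOUNDARIES.items():
        if percent >= cut:
            return grade
    return "U"


def needs_review(percent: float) -> bool:
    return any(abs(percent - cut) <= REVIEW_MARGIN for cut in BOUNDARIES.values())


if __name__ == "__main__":
    for p in (74, 80, 59.5, 30):
        print(p, award(p), "review" if needs_review(p) else "confirm")
```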
It can be seen that the results at each stage 1001 to 1011 are fed into a data set 1013 and then into a statistical output 1015 in the form of onscreen data. That onscreen data can be used throughout the process to keep track of the results and assist the decision making at each stage.
Diagnostic Outcomes: The third step in the overall process, diagnostic outcomes, is a developed form of the output stage of the dynamically adaptive assessment. The outputs at the end of the dynamically adaptive assessment, as described previously, may indicate potential areas of special needs, conceptual weakness or cognitive or motor weakness that need to be investigated. This is particularly useful when compared to the results from the first step of psychometric or motor metric profiling.
The diagnostic outcomes are similar to those outputs but may also include specific reporting requirements for referral e.g. to specialist practitioners.
The diagnostic outcomes may include a grade or level or a point in a progression matrix, a report, a user profile, a proposed user work plan.
The diagnostic outcomes may include material or links to materials. Such materials could be used by the user to facilitate future learning and/or to meet particular needs. The diagnostic outcomes may propose further assessment.
The diagnostic outcomes can be fed back into the psychometric profiling stage in order to improve the results in future assessments.
The diagnostic outcomes may be directed to a user, an assessment developer, an assessment agency or another interested party.

Claims

1. A method of assessing a user comprising the steps of: a. computer means processing existing user data to determine a test starting point; b. computer means selecting at least one question or task from a database of questions and/or tasks; c. computer means putting the selected question(s) and/or task(s) to the user; d. computer means receiving a response from the user; e. computer means processing the user's response to generate a user result; f. repeating steps b, c, d and e until the computer means can generate a user outcome, the question(s) and/or task(s) selected at each selecting step being dependent on the previous user result or results; and g. computer means reporting the user outcome.
2. A method as claimed in claim 1, further comprising computer means processing the existing user data as well as the user's response to generate the user result.
3. A method as claimed in claim 1 or claim 2, in which the question(s) and/or task(s) selected at each selecting step are dependent on the previous response or responses of the user.
4. A method according to any preceding claim wherein the existing user data comprises historical data.
5. A method according to any preceding claim wherein the existing user data comprises existing ability data.
6. A method according to any one of claims 1 to 5 wherein the existing user data comprises attainment data.
7. A method according to any one of claims 1 to 6 wherein the existing user data comprises a user outcome or outcomes from previous assessments.
8. A method according to any one of claims 1 to 7 wherein the computer means determines the test starting point via at least one algorithm from a database of algorithms.
9. A method according to any one of claims 1 to 8 wherein the computer means selects the question or task from the database of questions and/or tasks via at least one algorithm from a database of algorithms.
10. A method according to claim 8 or claim 9 wherein the database of algorithms comprises at least some algorithms whose steps are pre-determined.
11. A method according to any one of claims 8 to 10 wherein the database of algorithms comprises at least some algorithms whose steps are dependent on the existing user data and/or the user result or results.
12. A method according to any one of claims 8 to 11 wherein the database of algorithms comprises at least some algorithms having at least some steps which are pre-determined and at least some steps which are dependent on the existing user data and/or the user result or results.
13. A method according to any one of claims 8 to 12 wherein the database of algorithms is stored on the computer means.
14. A method according to any one of claims 8 to 12 wherein the database of algorithms is stored on a record carrier for use with the computer means.
15. A method according to any one of claims 1 to 14 wherein the user result is generated via at least one marking algorithm from a database of marking algorithms.
16. A method according to any one of claims 1 to 15 wherein the user outcome is generated via at least one marking algorithm from a database of marking algorithms.
17. A method according to claim 15 or claim 16 wherein the database of marking algorithms comprises at least some marking algorithms whose steps are pre-determined.
18. A method according to any one of claims 15 to 17 wherein the database of marking algorithms comprises at least some marking algorithms whose steps are dependent on the existing user data and/or the user's response and/or the user result or results.
19. A method according to any one of claims 15 to 18 wherein the database of marking algorithms comprises at least some marking algorithms having at least some steps which are predetermined and at least some steps which are dependent on the existing user data and/or the user's response and/or the user result or results.
20. A method according to any one of claims 15 to 19 wherein the database of marking algorithms is stored on the computer means.
21. A method according to any one of claims 15 to 19 wherein the database of marking algorithms is stored on a record carrier for use with the computer means.
22. A method according to any one of claims 15 to 21 wherein the marking algorithms are periodically moderated.
23. A method according to any one of claims 1 to 22 wherein the database of questions and tasks includes questions and tasks with varying content, context, structure and language.
24. A method according to any one of claims 1 to 23 wherein the database of questions and tasks includes questions and tasks covering one or more subject areas or psychometric or motor metric domains.
25. A method according to any one of claims 1 to 24 wherein the database of questions and tasks includes questions and tasks at a plurality of difficulty levels.
26. A method according to any one of claims 1 to 25 wherein the questions and tasks in the database of questions and tasks include links to other questions and/or tasks in the database to facilitate question and task selection.
27. A method according to any one of claims 1 to 26 wherein the database of questions and/or tasks is stored on the computer means.
28. A method according to any one of claims 1 to 26 wherein the database of questions and/or tasks is stored on a record carrier for use with the computer means.
29. A method according to any one of claims 1 to 28 wherein step d of computer means receiving the user's response further comprises the computer means receiving auxiliary information.
30. A method according to claim 29 wherein the auxiliary information comprises the time taken for the user's response.
31. A method according to any one of claims 1 to 30 wherein the user outcome reported by the computer means comprises a summative report.
32. A method according to any one of claims 1 to 31 wherein the user outcome reported by the computer means comprises a formative report.
33. A method according to any one of claims 1 to 32 wherein the user outcome reported by the computer means comprises a progressive report.
34. Host computer means programmed to be operable to carry out a method according to any one of claims 1 to 29.
35. User computer means programmed to be operable to carry out a method according to any one of claims 1 to 29.
36. Computer means comprising host computer means according to claim 34 and user computer means according to claim 35.
37. A computer program including code portions which, when executed by computer means cause that computer means to carry out the steps of a method according to any one of claims 1 to 33.
38. A record carrier having recorded thereon information indicative of a computer program according to claim 37.
39. A signal carrying information indicative of a computer program according to claim 37.
40. A method of constructing a user assessment database comprising the steps of: a. computer means selecting assessment data from a first database of assessment data, the assessment data selected being appropriate to the particular user assessment being constructed; b. computer means modifying at least some of the selected assessment data; and c. computer means inputting the selected assessment data, modified and unmodified, into a second database of assessment data.
41. A method according to claim 40 wherein the assessment data in the first database of assessment data have been at least partially generated by the computer means.
42. A method according to claim 40 or claim 41 wherein the assessment data in the first database of assessment data have been at least partially generated by a human assessment developer.
43. A method according to any one of claims 40 to 42 wherein the assessment data comprises questions and/or tasks.
44. A method according to any one of claims 40 to 43 wherein the assessment data comprises user responses.
45. A method according to any one of claims 40 to 44 wherein the assessment data comprises marking schemes.
46. A method according to any one of claims 40 to 45 wherein the assessment data comprises assessment items.
47. A method according to any one of claims 40 to 46 wherein the assessment data comprises auxiliary data.
48. A method according to any one of claims 40 to 47 further comprising step d. of computer means generating additional assessment data and inputting the generated assessment data into the second database of assessment data.
49. A method according to any one of claims 40 to 48 wherein the first database of assessment data is stored on the computer means.
50. A method according to any one of claims 40 to 48 wherein the first database of assessment data is stored on a record carrier for use with the computer means.
51. A method according to any one of claims 40 to 50 wherein the second database of assessment data is stored on the computer means.
52. A method according to any one of claims 40 to 50 wherein the second database of assessment data is stored on a record carrier for use with the computer means.
53. A method of assessing a user according to any one of claims 1 to 39 wherein the database of questions and/or tasks is constructed by a method according to any one of claims 40 to 52.
54. A method of assessing a user comprising the steps of: a. processing existing user data to determine a test starting point;
b. selecting at least one question or task from a database of questions and/or tasks; c. putting the selected question(s) and/or task(s) to the user; d. receiving the user's response; e. processing the user's response and, optionally, the existing user data to generate a user result; f. repeating steps b, c, d and e until an appropriate user outcome can be generated, the question(s) and/or task(s) selected at each selecting step being dependent on the previous user result or results; and g. reporting the user outcome.
55. A method according to claim 54 wherein the existing user data comprises historical data.
56. A method according to claim 54 or claim 55 wherein the existing user data comprises existing ability data.
57. A method according to any one of claims 54 to 56 wherein the existing user data comprises attainment data.
58. A method according to any one of claims 54 to 57 wherein the existing user data comprises an outcome or outcomes from previous assessments.
59. A method according to any one of claims 54 to 58 wherein the test starting point is determined via at least one algorithm from a database of algorithms.
60. A method according to any one of claims 54 to 59 wherein the question or task is selected from the database of questions and/or tasks via at least one algorithm from a database of algorithms.
61. A method according to claim 59 or claim 60 wherein the database of algorithms comprises at least some algorithms whose steps are pre-determined.
62. A method according to any one of claims 59 to 61 wherein the database of algorithms comprises at least some algorithms whose steps are dependent on the existing user data and/or the user result or results.
63. A method according to any one of claims 59 to 62 wherein the database of algorithms comprises at least some algorithms having at least some steps which are pre-determined and at least some steps which are dependent on the existing user data and/or the user result or results.
64. A method according to any one of claims 54 to 63 wherein the user result is generated via at least one marking algorithm from a database of marking algorithms.
65. A method according to any one of claims 54 to 64 wherein the user outcome is generated via at least one marking algorithm from a database of marking algorithms.
66. A method according to claim 64 or claim 65 wherein the database of marking algorithms comprises at least some marking algorithms whose steps are pre-determined.
67. A method according to any one of claims 64 to 66 wherein the database of marking algorithms comprises at least some marking algorithms whose steps are dependent on the existing user data and/or the user's response and/or the user result or results .
68. A method according to any one of claims 64 to 67 wherein the database of marking algorithms comprises at least some marking algorithms having at least some steps which are predetermined and at least some steps which are dependent on the existing user data and/or the user's response and/or the user result or results.
69. A method according to any one of claims 64 to 68 wherein the marking algorithms are periodically moderated.
70. A method according to any one of claims 54 to 69 wherein the database of questions and tasks includes questions and tasks with varying content, context, structure and language.
71. A method according to any one of claims 54 to 70 wherein the database of questions and tasks includes questions and tasks covering one or more subject areas or psychometric or motor metric domains.
72. A method according to any one of claims 54 to 71 wherein the database of questions and tasks includes questions and tasks at a plurality of difficulty levels.
73. A method according to any one of claims 54 to 72 wherein the questions and tasks in the database of questions and tasks include links to other questions and/or tasks in the database to facilitate question and task selection.
74. A method according to any one of claims 54 to 73 wherein step d of receiving the user's response further comprises receiving auxiliary information.
75. A method according to claim 74 wherein the auxiliary information comprises the time taken for the user's response.
76. A method according to any one of claims 54 to 75 wherein the user outcome comprises a summative report.
77. A method according to any one of claims 54 to 76 wherein the user outcome comprises a formative report.
78. A method according to any one of claims 54 to 77 wherein the user outcome comprises a progressive report.
79. A method of assessing a user comprising the steps of:
a. host computer means processing any appropriate existing user data to determine a test starting point and a test delivery method; b. host computer means selecting at least one question and/or task from a database of questions and/or tasks; c. user computer means putting the selected question(s) and/or task(s) to the user; d. user computer means receiving the user's response; e. host computer means processing the user's response and, optionally, the existing user data, to generate a user result; f. repeating steps b, c, d and e until the host computer means can generate a user outcome, the question(s) and/or task(s) selected at each selecting step being dependent on the previous user result or results; and g. user computer means reporting the user outcome.
PCT/GB2005/001667 2005-05-04 2005-05-04 Dynamic assessment system WO2006117498A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/GB2005/001667 WO2006117498A2 (en) 2005-05-04 2005-05-04 Dynamic assessment system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/GB2005/001667 WO2006117498A2 (en) 2005-05-04 2005-05-04 Dynamic assessment system

Publications (1)

Publication Number Publication Date
WO2006117498A2 true WO2006117498A2 (en) 2006-11-09

Family

ID=35457213

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2005/001667 WO2006117498A2 (en) 2005-05-04 2005-05-04 Dynamic assessment system

Country Status (1)

Country Link
WO (1) WO2006117498A2 (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8356997B1 (en) 2007-12-10 2013-01-22 Accella Learning, LLC Intelligent tutoring system
US8684747B1 (en) 2007-12-10 2014-04-01 Accella Learning, LLC Intelligent tutoring system
US9542853B1 (en) 2007-12-10 2017-01-10 Accella Learning, LLC Instruction based on competency assessment and prediction


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase in:

Ref country code: DE

WWW Wipo information: withdrawn in national office

Country of ref document: DE

NENP Non-entry into the national phase in:

Ref country code: RU

WWW Wipo information: withdrawn in national office

Country of ref document: RU

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: COMMUNICATION NOT DELIVERED. NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 69(1) EPC (EPO FORM 1205A DATED 02.04.2008)

122 Ep: pct app. not ent. europ. phase

Ref document number: 05740598

Country of ref document: EP

Kind code of ref document: A1