WO2010134885A1 - Predicting the correctness of eyewitness' statements with semantic evaluation method (sem) - Google Patents


Info

Publication number
WO2010134885A1
Authority
WO
WIPO (PCT)
Application number
PCT/SE2010/050548
Other languages
French (fr)
Inventor
Farhan Sarwar
Sverker SIKSTRÖM
Original Assignee
Farhan Sarwar
Sikstroem Sverker
Application filed by Farhan Sarwar, Sikstroem Sverker
Publication of WO2010134885A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Definitions

  • A prediction of correctness (P) can be made by using multiple linear regression, where we find the coefficients (R) that best describe the linear relation between the semantic space (u') and the known correctness values of the statements (V):
  • R can be calculated by:
  • This formula can now be used to predict the last statements with an unknown correctness (and which has not been used during training).
  • the correlation between predicted variable and external variable is 0.87.
  • we would conclude that the statement "Rolf is stealing" is more likely to be correct than false, because P is closer to 1 than to 0.
  • % Word by context representation (x) of the words:
    % 'Lars' (1), 'is' (2), 'stealing' (3), 'judge' (4), and 'rolf' (5)
    x = [1 1 0; 1 1 1; 1 0 1; 0 1 0; 0 0 1]

Abstract

A method to predict the correctness of eyewitness' statements. The method comprises: collecting a text corpus comprising a set of words; generating a representation of the text corpus; creating a semantic space for the set of words; summarizing statements in the semantic space; training on a set of training statements where the correctness is known to identify a prediction model; and applying the model to new statements for predicting correctness. A computer program and a computer-readable means containing said computer program are also included.

Description

Predicting the Correctness of Eyewitness' Statements with Semantic Evaluation Method (SEM)
Introduction
Eyewitnesses are the key actors in crime situations and frequently the only source of information for investigators, lawyers and courts. Although blood, DNA, and other analyses do provide valuable information about a crime, eyewitnesses' testimonies still have a significant role in determining the nature of a crime and assigning responsibility for it. Accordingly, eyewitnesses are expected to provide credible information about a crime. However, eyewitnesses often fail to provide the required information. For example, relying on DNA evidence the well-known Innocence Project has exonerated 252 people who were on death row. In 74% of these cases conviction was based on eyewitness misidentification, and in 16% of these cases an informant testified against the defendant, leading to convictions on the basis of incorrect eyewitness testimony (Innocence Project, n.d.).
Research evidence shows that eyewitnesses' memories are generally very fragile and that there are a number of distortion factors. Such factors are, for example, simple forgetting, discussions among co-witnesses (Shaw-III, Garven, & Wood, 1997), the eyewitness's exposure to media coverage of the witnessed event (Loftus & Hoffman, 1989), questions asked by investigators, lawyers, and healthcare personnel, and source attribution errors of the eyewitnesses. How to evaluate eyewitness memory is another big issue. No other person but the eyewitness has direct experience of the event occurring at the crime scene, and no other criterion is available for evaluating the eyewitness testimony. There are some indirect methods to test the validity of eyewitness statements, for example the cognitive interview (CI, Allwood, Ask, & Granhag, 2005; Fisher & Schreiber, 2007), criteria based content analysis (CBCA, Kulkofsky, 2008; Vrij, 2003) and the reality monitoring (RM, Johnson & Raye, 1981) technique.
In spite of these methods, people in the criminal justice system mostly use eyewitness confidence as a yardstick to evaluate the credibility of eyewitness statements (Brewer & Burke, 2002; Brewer, Potter, Fisher, Bond, & Luszcz, 1999; Juslin, Olsson, & Winman, 1996). Confident eyewitnesses are considered more credible than those who are less confident, and vice versa. However, the relationship between confidence and accuracy is not constant across situations. An eyewitness's confidence can be influenced by social factors that are independent of perceptual and memorial processes (Luus & Wells, 1994). For example, multiple retellings of an event can increase confidence while memory accuracy does not increase, simply because of the reiteration effect (Hertwig, Gigerenzer, & Hoffrage, 1997). Consequently, researchers have reservations about using confidence as a barometer of accuracy in eyewitness statements (e.g. Brewer & Burke, 2002). The methods discussed above for evaluating the accuracy of eyewitness testimony are subjective and have limitations in applied contexts, which is why their use to evaluate the credibility of eyewitness testimony has been criticized in the research literature. Here we suggest a new statistical method, the Semantic Evaluation Method (SEM), to evaluate the accuracy of eyewitness statements. This method is objective, reproducible, and ecologically valid compared to the existing methods.
Theory behind the method
SEM is inspired by the theory of latent semantic analysis (LSA) (Landauer & Dumais, 1997; Landauer, McNamara, Dennis, & Kintsch, 2007). According to this theory, humans possess knowledge and express that knowledge through language (or words). How words occur in text in relation to each other essentially determines their meaning and communicates the knowledge. Based on experimental research we have found support that SEM significantly distinguishes between correct and incorrect statements.
Summary of the Invention
More particularly, the object of SEM is to provide a computer-implemented method and a system that allow for prediction of the correctness of a statement. The method is performed on at least one computer and comprises the steps of: collecting a text corpus comprising a set of words; generating a representation of the text corpus; creating a semantic space for the set of words; summarizing statements in the semantic space; training on a set of training statements where the correctness is known to identify a prediction model; and applying the model to statements for predicting correctness.
Here, a "text corpus" is a large and structured set of texts which is typically electronically stored and which may be electronically processed. The text corpus may contain texts in a single language or text data in multiple languages, and is collected by using conventional, known methods and systems.
A "semantic space" is the result of a mathematical algorithm that takes a text corpus as input and creates a high-dimensional space, where the dimensions in the space correspond to semantic qualities, or features, of the words in the corpus. For example, one dimension may represent whether the words relate to something that is alive, whereas another dimension may represent to what extent the word relates to an emotion. Synonyms are located nearby each other in the space, and the distance between words is a measure of how semantically close the words are. The distance between two words is typically measured by the cosine of the angle between the vectors representing the words, although other distance measures may also be used. Semantic spaces are created by using co-occurrence information, and examples of algorithms for creating semantic spaces include the known Latent Semantic Analysis (LSA) (Landauer & Dumais, 1997; Landauer, et al., 2007), Independent Component Analysis and the random indexing (RI) method (Sahlgren, 2007).
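As a minimal illustration of the cosine distance measure described above, the following sketch uses hypothetical two-dimensional word vectors (the words and coordinates are invented for illustration, not taken from any real semantic space):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two word vectors: values near 1.0
    indicate semantically close words, values near 0.0 unrelated ones."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical word vectors: two near-synonyms point in almost the
# same direction, an unrelated word points elsewhere.
car  = [0.9, 0.1]
auto = [0.8, 0.2]
tree = [0.1, 0.9]

print(cosine_similarity(car, auto))  # close to 1
print(cosine_similarity(car, tree))  # much smaller
```

In a real semantic space the vectors would have hundreds of dimensions, but the distance computation is the same.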
A location in the semantic space is a point in the semantic space, which represents e.g. a word, but may also represent several words or even set(s) of keywords. A "semantic dimension" is any judgment relating to the meaning (semantic) of a word (concept), such as positive or negative evaluations, trustworthiness, innovations, intelligence, etc.
By statements we mean any form of set of words generated by humans. Statements can be generated in writing or orally. Statements can be correct, incorrect, or partly correct. Comparing the statement with an external criterion may validate the correctness of the statement.
In a preferred embodiment of the computer-implemented method said text corpus is an electronically structured and processed set of texts, and said text corpus contains texts in a single language which is the same language as is used for said new statement whose correctness is to be predicted.
In a preferred embodiment of said method, said semantic space is calculated using an algorithm selected from the group of LSA or RI. Preferably, the semantic space is then compressed using the SVD algorithm.
In a preferred embodiment of said method, the frequency is normalized by taking the logarithm of the frequency when creating a semantic space using the LSA algorithm. In a preferred embodiment of said method, said text corpus comprises more than ten times, preferably more than fifty times, as many words as said statement whose correctness is to be predicted.
According to another aspect of the invention, a system for predicting a value of a variable associated with a target word is described. The system comprises at least one computer and is configured to: collect a text corpus comprising a set of words; generate a representation of the text corpus; create a semantic space for the set of words, based on the representation of the text corpus; define, for a location in the semantic space, a value of the variable; estimate, for the target word, a value of the variable, based on the semantic space and the defined variable value of the location in the semantic space; and calculate a predicted value of the target word, on the basis of the semantic space, the defined variable value of the location in the semantic space and the estimated variable value of the target word.
According to yet another aspect of the invention a computer readable medium is provided, having stored thereon a computer program having software instructions which when run on a computer cause the computer to perform the steps of: collecting a text corpus comprising a set of words that include the target word; generating a representation of the text corpus; creating a semantic space for the set of words, based on the representation of the text corpus; defining, for a location in the semantic space, a value of the variable; estimating, for the target word, a value of the variable, based on the semantic space and the defined variable value of the location in the semantic space; and calculating a predicted value of the target word, on the basis of the semantic space, the defined variable value of the location in the semantic space and the estimated variable value of the target word. The inventive system and computer readable medium may, as described, comprise, be configured to execute and/or have stored software instructions for performing any of the features described above in association with the inventive method, and have the corresponding advantages.
This invention predicts the degree of correctness of statements. The predictions are made by first converting the words in the statement to a representation in the semantic space. The relation between how correct a statement is and the semantic representation is identified by studying known examples. This relation can then be used to predict correctness of new statements. The invention has been validated in experimental studies. The invention can be divided into the following steps:
(1) Creating a semantic space
The creation of the semantic space requires a huge collection of text called a corpus. This corpus needs to be in the same language as the eyewitness data that is going to be analyzed, and it has to be large. In addition it is preferred, but not necessary, that the general semantic topic of the corpus at least vaguely relates to the eyewitness statements. However, it is more important that the corpus is large than that it resembles the eyewitness text. The text corpus can, for example, be collected by conventional, automatic search robots that scan the internet, text or news databases, databases of spoken language, electronic sources or other collections of text.
Next a semantic space is created from the text corpus, for example by using Latent Semantic Analysis (LSA), Independent Component Analysis (ICA) or random indexing (RI). Other equivalent algorithms that may transform words to distributed semantic representations may also be used. In brief, LSA first creates a table of words (rows) and local contexts (columns), where each table entry counts the frequency of a word in the local text context. Semantic spaces are created by the known data compression algorithm called singular value decomposition (SVD) (Golub & Kahan, 1965), which reduces the large number of contexts to a moderate number of semantic dimensions. The quality of the semantic space can be measured by testing the semantic space on synonym tests. In this invention the algorithm, the parameter settings and the distance measure that yield the best performance on such tests are preferred. The result of such an analysis (e.g. the parameter for the number of dimensions used, etc.) depends on the corpus that is used, and may therefore vary for different applications. The skilled person knows how to select appropriate algorithms and parameter settings. He may, for instance, use the information in the references cited in this application.
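The LSA pipeline just described can be sketched as follows. This is a minimal sketch: the logarithmic normalization and the choice of keeping two dimensions are assumptions for illustration, not prescriptions of the method.

```python
import numpy as np

def create_semantic_space(freq_matrix, n_dims, log_normalize=True):
    """Build a semantic space from a word-by-context frequency matrix
    via SVD, keeping only the first n_dims dimensions and normalizing
    each word vector to length 1."""
    x = np.asarray(freq_matrix, dtype=float)
    if log_normalize:
        x = np.log(1.0 + x)          # dampen high-frequency words
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    u = u[:, :n_dims]                # one row per word, n_dims columns
    # Normalize every word vector to length 1.
    return u / np.linalg.norm(u, axis=1, keepdims=True)

# Toy word-by-context counts (rows = words, columns = contexts).
space = create_semantic_space([[1, 1, 0],
                               [1, 1, 1],
                               [1, 0, 1],
                               [0, 1, 0],
                               [0, 0, 1]], n_dims=2)
print(space.shape)  # (5, 2)
```

With a real corpus the frequency matrix would have many thousands of rows and columns, and the number of retained dimensions would be tuned, e.g. against synonym tests as described above.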
(2) Representing a statement as a location in the semantic space
First, training statements are collected. These statements have a known truth value, i.e. are either correct or incorrect, and are representative of the statements whose correctness it is desirable to predict. For example, if the purpose is to predict statements made from eyewitness testimony, then a number of eyewitness training statements are collected for which it is known whether they are correct or incorrect. Secondly, the statements whose correctness is indeed to be predicted are also collected.
These statements are summarized in the semantic space. This is done by identifying the location in the semantic space associated with each word in the sentence. The statement is summarized as the mean location of the words in the sentence.
(3) Training the relation between the semantic space and correctness of statements
A model of the relation between the training statements and their correctness is built. This is conducted by known, suitable mathematical multidimensional optimization techniques, for example by using multiple linear regression where the dimensions in the semantic space are used as regressors for the correctness of the statement (Cohen, Cohen, West, & Aiken, 2003). However, other techniques for predicting the relation between the semantic space and an external variable may also be used, for example classifiers such as support vector machines (Meyer, Leisch, & Hornik, 2003). The predictor that produces the highest correlation between predicted accuracy and true accuracy of the statements is selected.
Multiple linear regression is a known form of regression analysis in which the relationship between one or more independent variables and another variable, called dependent variable, is modeled by a least squares function, called linear regression equation. This function is a linear combination of one or more model parameters, called regression coefficients. A linear regression equation with one independent variable represents a straight line, and the results are subject to statistical analysis. In this context, conventional multiple linear regression is used.
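The training step above can be sketched with an ordinary least-squares fit. The statement summaries and the correctness values below are hypothetical toy data, and the helper names are illustrative; an intercept term is added as is usual in multiple linear regression.

```python
import numpy as np

def train_correctness_model(statement_vectors, correctness):
    """Least-squares fit of coefficients R relating semantic-space
    coordinates of statements to known correctness values
    (0 = incorrect, 1 = correct). An intercept column is prepended."""
    X = np.column_stack([np.ones(len(statement_vectors)), statement_vectors])
    R, *_ = np.linalg.lstsq(X, np.asarray(correctness, dtype=float), rcond=None)
    return R

def predict_correctness(R, statement_vector):
    """Apply the trained coefficients to a new statement's coordinates."""
    return float(R @ np.concatenate([[1.0], statement_vector]))

# Hypothetical 2-dimensional statement summaries with known correctness.
train_X = [[-1.00, 0.00], [-0.78, 0.62], [-0.78, -0.62]]
train_V = [1, 0, 1]

R = train_correctness_model(train_X, train_V)
p = predict_correctness(R, [-0.90, -0.40])
print(p)  # closer to 1 than to 0 for these toy values
```

In practice the training set would be much larger than the number of semantic dimensions, so the fit would be a genuine least-squares approximation rather than an exact solution.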
(4) Predicting correctness of statements
Following training, the model can be applied to new data. This is done by summarizing the to-be-predicted statements in the semantic space and applying the multiple linear regression. The result is a predicted probability of correctness of the statement. A simple example of summarizing such statements and conducting multiple linear regression follows.
Example:
For providing an example with numerical values, the following corpus is considered as the text on which the semantic space is created. A real corpus needs to be huge (megabytes or larger); however, for practical purposes we here consider a small toy example:
document 1 : Lars is stealing, document 2: Lars is a judge, document 3: Rolf is stealing.
The first step is to create a semantic space. In this example LSA is used, but semantic spaces can also be created using several other methods, such as probabilistic latent semantic analysis, random indexing or ICA. First a context-by-word frequency table of the words included in our corpus is made, where the words are represented in the rows and the contexts in the columns, as indicated in Table 1 below.
Table 1: Word frequency table (matrix)
Word/Contexts   document 1   document 2   document 3
Lars            1            1            0
Is              1            1            1
Stealing        1            0            1
Judge           0            1            0
Rolf            0            0            1
In a word frequency table, high-frequency words not carrying any semantic information (e.g., "a" and "the") are not present. To improve performance, the frequency table may be normalized by taking the logarithm of the frequency, but this step is here omitted for simplicity. Each cell represents the number of occurrences of a word in the context. By context is meant either a document or a subset of a document.
To create a semantic space, singular value decomposition (SVD) (Golub & Kahan, 1965) is conducted. The method of performing singular value decomposition is known within the field of linear algebra and is available as a standard package in e.g. the commercially available linear algebra package LAPACK or in the GNU Scientific Library (Anderson, et al., 1999).
The following variables are written in matrix notation, where x is the context by word frequency table (the frequency matrix of Table 1), u is the semantic space, s contains the singular values, and v contains the right singular vectors. The SVD computes an approximation of x, labeled x': x' = u * s * v'
where u, s and v can be calculated from x by applying the known algorithm of SVD:
[u s v] = SVD(x)
For details of how to calculate SVD see (Golub & Kahan, 1965).
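The SVD step above can be sketched with NumPy standing in for the LAPACK/MATLAB routines referenced in the text (NumPy is an assumption of this sketch; the appendix uses MATLAB's svd, and the sign of each dimension is arbitrary across implementations):

```python
import numpy as np

# The word-by-context frequency matrix of Table 1 (rows = words)
x = np.array([[1, 1, 0],
              [1, 1, 1],
              [1, 0, 1],
              [0, 1, 0],
              [0, 0, 1]], dtype=float)

# Compute the SVD; s holds the singular values, largest first
u, s, vt = np.linalg.svd(x, full_matrices=False)

# Keep the first two dimensions and normalize each word vector to length 1
u2 = u[:, :2] / np.linalg.norm(u[:, :2], axis=1, keepdims=True)
print(s)   # singular values
print(u2)  # normalized two-dimensional semantic space
```

The singular values match the s matrix in the appendix output (2.5243, 1.4142, 0.7923), and the magnitudes of u2 match Table 2.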
The columns of u represent the dimensions in the space and the rows represent the words. Each word is normalized to a length of 1. This is done by calculating the length of the vector representing each word and dividing the dimension values of the word with this length:
ui' = ui / ||ui||
where ui represents the semantic representation of word i and ||ui|| is the length of vector ui. Hence, u' contains the normalized values of u. For example, if u1 = [1 2], then the normalized vector with a length of one is u1' = [1 2]/(1² + 2²)^(1/2) = [5^(-1/2) 2·5^(-1/2)].
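The normalization of the example vector above can be checked directly (a minimal Python sketch; the appendix performs the same operation in MATLAB):

```python
import math

def normalize(v):
    """Scale a vector to length 1."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

u1 = normalize([1, 2])
print(u1)  # [5**-0.5, 2 * 5**-0.5], i.e. approximately [0.4472, 0.8944]
```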
A feature of the SVD algorithm is that it orders the dimensions in u by how important they are in predicting x', so that the explained variance of the first dimensions is larger than that of the later dimensions. The dimensions represent features in the semantic space. To understand which features are represented, it is necessary to interpret the dimensions. For example, Table 2 shows the first two dimensions of u' following normalization:
Table 2: the normalized semantic space (u')
Word/Dimensions 1 2
Lars -0.68 0.73
Is -1.00 0.00
Stealing -0.68 -0.73
Judge -0.39 0.92
Rolf -0.39 -0.92
The statements in the semantic space are then summarized. This summary is made by averaging the corresponding vectors in the semantic space, and then normalizing the result so that the length of the resulting vector is one. For example, the semantic representations of 'Lars' [-0.68 0.73] and 'is' [-1.00 0.00] can be averaged and then normalized to a length of 1, and the result is [-0.92, 0.40]. The semantic space can now be used to make a prediction (P) of the correctness of a statement (V).
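The averaging and normalization just described can be reproduced with the rounded Table 2 values (a Python sketch; the numbers are taken from this example):

```python
import math

lars = [-0.68, 0.73]  # semantic vector for 'Lars' from Table 2 (rounded)
is_  = [-1.00, 0.00]  # semantic vector for 'is' from Table 2

# Average the two word vectors, then normalize to length 1
avg = [(a + b) / 2 for a, b in zip(lars, is_)]
length = math.sqrt(sum(x * x for x in avg))
summary = [x / length for x in avg]
print(summary)  # close to [-0.92, 0.40], as stated in the text
```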
A prediction of correctness (P) can be made by using multiple linear regression, where we find the coefficients (R) that best describe the linear relation between the semantic space (u') and the known value of correctness of the statements (V):
V ≈ u' * R
Following the well-known algorithm for solving multiple linear regression, R can be calculated by:
R = (u'ᵀ * u')⁻¹ * u'ᵀ * V
For example, assume that we have access to an eyewitness corpus (see first column of Table 3) where the correctness (V) of the first two statements is known, but the correctness of statement three is unknown (see second column in Table 3). Notice that this eyewitness corpus does not need to be as large as the corpus that the semantic space is created from. However, to avoid overfitting, the number of statements should be at least three times the number of dimensions used in the space.
The statements are first summarized by averaging the semantic representation of the words in the statements, and then normalizing, as described above. By applying the formula above on the first two statements, the following coefficients are obtained R = [1.0 0.0 -1.61] (where the first number represents a constant that is added to the prediction and the following numbers correspond to coefficients for dimension 1 and 2 respectively). The predicted correctness of all statements (P) can then be calculated by the following formula:
P = u' * R
This formula can now be used to predict the last statement, whose correctness is unknown (and which has not been used during training).
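As a check, the prediction step can be reproduced by multiplying the fitted coefficients back onto the summarized statements (a minimal Python sketch; the numbers are taken from this example and the appendix output):

```python
# Summarized statements with a leading constant column of ones
U = [[1.0, -1.0000,  0.0000],   # 'Lars is stealing'
     [1.0, -0.7836,  0.6213],   # 'Lars is a judge'
     [1.0, -0.7836, -0.6213]]   # 'Rolf is stealing'

# Coefficients fitted on the first two statements (from the text)
R = [1.0, 0.0, -1.6096]

# Predicted correctness P = U * R
P = [sum(u_i * r_i for u_i, r_i in zip(row, R)) for row in U]
print(P)  # approximately [1.0, 0.0, 2.0]
```

The first two predictions recover the known training values, and the third is the prediction for the held-out statement.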
Table 3
Statement           V (known correctness)   P (predicted correctness)
Lars is stealing    1                        1.00
Lars is a judge     0                       -0.00
Rolf is stealing    unknown                  2.00
Table 3 shows the words in the corpora, the known correctness of the statements V (0=incorrect, 1=correct) and the predicted correctness (P). The correlation between the predicted variable and the external variable is 0.87. In this example, we would conclude that the statement "Rolf is stealing" is more likely to be correct than false because P is closer to 1 than to 0 (here, P = 2).
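The reported correlation of 0.87 can be reproduced with a standard Pearson correlation over the values in Table 3 (a Python sketch; statement three is assumed correct, as in the appendix):

```python
import math

P = [1.0, 0.0, 2.0]  # predicted correctness
V = [1.0, 0.0, 1.0]  # known correctness (statement 3 assumed to be 1)

# Pearson correlation coefficient
mp = sum(P) / len(P)
mv = sum(V) / len(V)
cov = sum((p - mp) * (v - mv) for p, v in zip(P, V))
r = cov / (math.sqrt(sum((p - mp) ** 2 for p in P)) *
           math.sqrt(sum((v - mv) ** 2 for v in V)))
print(round(r, 4))  # 0.866, matching the appendix output
```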
The calculations underlying this example are disclosed in more detail in the following appendix:
Appendix
This Appendix implements the invention, using the example and numbers described above. The implementation is written in standard MATLAB code and is commented; comment lines commence with the sign "%". The produced output follows the code.
function patentExample
%Word by context representation (x) of the words:
%'Lars' (1), 'is' (2), 'stealing' (3), 'judge' (4), and 'rolf' (5)
x=[1 1 0;1 1 1;1 0 1;0 1 0;0 0 1]
%Calculation of the SVD
[u s v] = svd(x)
%Select the first two dimensions of u, and normalize each vector to a length of 1
u=normalize(u(:,1:2))
%Averaging and normalizing 'Lars is'
normalize(u(1,:)+u(2,:))
%Creating semantic representation of eyewitness statements (u2)
u2(1,:)=u(1,:)+u(2,:)+u(3,:); %'lars is stealing'
u2(2,:)=u(1,:)+u(2,:)+u(4,:); %'lars is judge'
u2(3,:)=u(5,:)+u(2,:)+u(3,:); %'rolf is stealing'
u2=normalize(u2); %Normalize
%Setting accuracy of statements (first two known)
V=[1; 0; 1];
%Adding the constant 1 to u2
U=[ones(length(V),1) u2]
%Solving for R (regression coefficients for the known V, in V=U*R)
R=U(1:2,:)\V(1:2)
%Predicting accuracy (P)
P=U*R
%Correlating predicted and known accuracy (known accuracy for statement 3 is assumed to be 1)
corr(P,V)

function u=normalize(u)
%Normalize the length of each vector to 1
[N1 N2]=size(u);
for i=1:N1
  u(i,:)=u(i,:)/sum(u(i,:).^2)^0.5;
end
» patentExample

x =

     1     1     0
     1     1     1
     1     0     1
     0     1     0
     0     0     1

u =

   -0.4692    0.5000   -0.3935    0.0836    0.6066
   -0.6838    0.0000    0.1800   -0.6277   -0.3256
   -0.4692   -0.5000   -0.3935    0.5441   -0.2810
   -0.2146    0.5000    0.5735    0.5441   -0.2810
   -0.2146   -0.5000    0.5735    0.0836    0.6066

s =

    2.5243         0         0
         0    1.4142         0
         0         0    0.7923
         0         0         0
         0         0         0

v =

   -0.6426         0   -0.7662
   -0.5418    0.7071    0.4544
   -0.5418   -0.7071    0.4544

u =

   -0.6843    0.7292
   -1.0000    0.0000
   -0.6843   -0.7292
   -0.3944    0.9189
   -0.3944   -0.9189

ans =

   -0.9177    0.3973

U =

    1.0000   -1.0000    0.0000
    1.0000   -0.7836    0.6213
    1.0000   -0.7836   -0.6213

R =

    1.0000
         0
   -1.6096

P =

    1.0000
   -0.0000
    2.0000

ans =

    0.8660

References
Allwood, C. M., Ask, K., & Granhag, P. A. (2005). The Cognitive Interview: Effects on the realism in witnesses' confidence in their free recall. Psychology, Crime & Law, 11(2), 183-198.
Anderson, E., Bai, Z., Bischof, C., Blackford, S., Demmel, J., Dongarra, J., et al. (1999). LAPACK Users' Guide (3rd ed.). Society for Industrial and Applied Mathematics. ISBN 0898714478.
Brewer, N., & Burke, A. (2002). Effects of testimonial inconsistencies and eyewitness confidence on mock-juror judgments. Law and Human Behavior, 26(3), 353-364.
Brewer, N., Potter, R., Fisher, R. P., Bond, N., & Luszcz, M. A. (1999). Beliefs and data on the relationship between consistency and accuracy of eyewitness testimony. Applied Cognitive Psychology, 13(4), 297-313.
Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Fisher, R. P., & Schreiber, N. (2007). Interview protocols for improving eyewitness memory. In The handbook of eyewitness psychology, Vol. 1: Memory for events (pp. 53-80). Mahwah, NJ: Lawrence Erlbaum Associates.
Golub, G. H., & Kahan, W. (1965). Calculating the singular values and pseudo-inverse of a matrix. Journal of the Society for Industrial and Applied Mathematics: Series B, Numerical Analysis, 2(2), 205-224.
Hertwig, R., Gigerenzer, G., & Hoffrage, U. (1997). The reiteration effect in hindsight bias. Psychological Review, 104(1), 194-202.
Innocent project (n.d.). Facts on post-conviction DNA exonerations. Retrieved April 6, 2010, from http://www.innocenceproject.org/Content/Facts_on_PostConviction_DNA_Exonerations.php
Juslin, P., Olsson, N., & Winman, A. (1996). Calibration and diagnosticity of confidence in eyewitness identification: Comments on what can be inferred from the low confidence-accuracy correlation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(5), 1304-1316.
Kulkofsky, S. (2008). Credible but inaccurate: Can Criterion-Based Content Analysis (CBCA) distinguish true and false memories? In Child sexual abuse: Issues and challenges (pp. 21-42). Hauppauge, NY: Nova Science Publishers.
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2), 211-240.
Latent semantic analysis (2007).
Loftus, E. F., & Hoffman, H. G. (1989). Misinformation and memory: The creation of new memories. Journal of Experimental Psychology: General, 118(1), 100-104.
Luus, C. A. E., & Wells, G. L. (1994). The malleability of eyewitness confidence: Co-witness and perseverance effects. Journal of Applied Psychology, 79(5), 714-724.
Meyer, D., Leisch, F., & Hornik, K. (2003). The support vector machine under test. Neurocomputing, 55(1-2), 169-186.
Sahlgren, M. (2007). An introduction to random indexing. Stockholm University, Stockholm.
Shaw III, J. S., Garven, S., & Wood, J. M. (1997). Co-witness information can have immediate effects on eyewitness memory reports. Law and Human Behavior, 21(5), 503-523.

Claims

1. A method for predicting correctness of statements, comprising: collecting a text corpus comprising a set of words; generating a representation of the text corpus; creating a semantic space for the set of words; summarizing statements in the semantic space; training on a set of training statements where the correctness is known to identify a prediction model; and applying the model to a new statement whose correctness is to be predicted.
2. The method according to claim 1, wherein said text corpus is an electronically structured and processed set of texts, and wherein said text corpus contains texts in a single language which is the same language as is used for said new statement whose correctness is to be predicted.
3. The method according to claim 1 or 2, wherein said semantic space is calculated using an algorithm selected from the group of LSA or RI followed by compression using the algorithm SVD.
4. The method according to claim 3, wherein the frequency is normalized by taking a logarithm of the frequency when creating a semantic space using the LSA algorithm.
5. The method according to any of claims 1 - 4, wherein said text corpus comprises more than ten times, preferably more than fifty times, as many words as said statement whose correctness is to be predicted.
6. A system for predicting a value of a variable associated with a target word, said system comprising at least one computer, wherein said system is configured to: collect a text corpus comprising a set of words; generate a representation of the text corpus; create a semantic space for the set of words, based on the representation of the text corpus; train on a set of training statements where the correctness is known to identify a prediction model; and apply the model to a new statement whose correctness is to be predicted.
7. A computer readable medium having stored thereon a computer program having software instructions which when run on a computer cause the computer to perform the steps of: collecting a text corpus comprising a set of words that include the target word; generating a representation of the text corpus; creating a semantic space for the set of words, based on the representation of the text corpus; training on a set of training statements where the correctness is known to identify a prediction model; and applying the model to a new statement whose correctness is to be predicted.
PCT/SE2010/050548 2009-05-20 2010-05-20 Predicting the correctness of eyewitness' statements with semantic evaluation method (sem) WO2010134885A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SE0900685 2009-05-20
SE0900685-9 2009-05-20

Publications (1)

Publication Number Publication Date
WO2010134885A1 true WO2010134885A1 (en) 2010-11-25

Family

ID=43126389

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2010/050548 WO2010134885A1 (en) 2009-05-20 2010-05-20 Predicting the correctness of eyewitness' statements with semantic evaluation method (sem)

Country Status (1)

Country Link
WO (1) WO2010134885A1 (en)


Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4839853A (en) * 1988-09-15 1989-06-13 Bell Communications Research, Inc. Computer information retrieval using latent semantic structure
GB2391967A (en) * 2002-08-16 2004-02-18 Canon Kk Information analysing apparatus
US6847966B1 (en) * 2002-04-24 2005-01-25 Engenium Corporation Method and system for optimally searching a document database using a representative semantic space
US20050049867A1 (en) * 2003-08-11 2005-03-03 Paul Deane Cooccurrence and constructions
US7149695B1 (en) * 2000-10-13 2006-12-12 Apple Computer, Inc. Method and apparatus for speech recognition using semantic inference and word agglomeration
US20070217676A1 (en) * 2006-03-15 2007-09-20 Kristen Grauman Pyramid match kernel and related techniques
US20080114755A1 (en) * 2006-11-15 2008-05-15 Collective Intellect, Inc. Identifying sources of media content having a high likelihood of producing on-topic content
US20100094814A1 (en) * 2008-10-13 2010-04-15 James Alexander Levy Assessment Generation Using the Semantic Web


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SARGUR SRIHARI ET AL: "Automatic scoring of short handwritten essays in reading comprehension tests", ARTIFICIAL INTELLIGENCE, vol. 172, no. 2-3, February 2008 (2008-02-01), pages 300 - 324, XP022392587 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113240177A (en) * 2021-05-13 2021-08-10 北京百度网讯科技有限公司 Method for training prediction model, prediction method, prediction device, electronic device and medium
CN113240177B (en) * 2021-05-13 2023-12-19 北京百度网讯科技有限公司 Method for training prediction model, prediction method, device, electronic equipment and medium


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10778025

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC, EPO FORM 1205A DATED 29.02.2012

122 Ep: pct application non-entry in european phase

Ref document number: 10778025

Country of ref document: EP

Kind code of ref document: A1