Function: [H, pValue, KSstatistic] = kstest2(x1, x2, alpha, tail)

The document describes the kstest2 function in MATLAB, which performs a two-sample Kolmogorov-Smirnov test to determine if two independent random samples come from the same continuous population. It tests the null hypothesis that the two underlying cumulative distribution functions are equal against alternatives that they are unequal, that one is larger than the other, or that one is smaller. It returns values indicating whether to reject the null hypothesis, the p-value, and the test statistic.
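As a minimal sketch of what the three test statistics measure, the snippet below evaluates the empirical distribution functions of two small made-up samples directly at the pooled data points. The sample values and the names `pts`, `S1`, `S2`, `ksUnequal`, `ksLarger` and `ksSmaller` are illustrative assumptions; this is a different, simpler route than the one taken inside kstest2, and it runs without the function itself.

```matlab
% Hypothetical samples, chosen so the result is easy to check by hand.
x1 = [1 2 3 4]';
x2 = [3 4 5 6]';

% Evaluate both empirical CDFs at every pooled observation.
pts = sort([x1; x2]);
S1  = arrayfun(@(t) mean(x1 <= t), pts);   % fraction of x1 at or below t
S2  = arrayfun(@(t) mean(x2 <= t), pts);   % fraction of x2 at or below t

ksUnequal = max(abs(S1 - S2));   % two-sided:  max|S1(x) - S2(x)|
ksLarger  = max(S1 - S2);        % one-sided:  max[S1(x) - S2(x)]
ksSmaller = max(S2 - S1);        % one-sided:  max[S2(x) - S1(x)]

fprintf('unequal = %.2f, larger = %.2f, smaller = %.2f\n', ...
        ksUnequal, ksLarger, ksSmaller);
% prints: unequal = 0.50, larger = 0.50, smaller = 0.00
```

Because every value in x1 sits at or below its counterpart in x2, S1 is never below S2, so the 'smaller' statistic is zero while the other two coincide.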
Copyright
© Attribution Non-Commercial (BY-NC)

function [H, pValue, KSstatistic] = kstest2(x1, x2, alpha, tail)
%KSTEST2 Two-sample Kolmogorov-Smirnov goodness-of-fit hypothesis test.

%   H = KSTEST2(X1,X2,ALPHA,TYPE) performs a Kolmogorov-Smirnov (K-S) test
%   to determine if independent random samples, X1 and X2, are drawn from
%   the same underlying continuous population. ALPHA and TYPE are optional
%   scalar inputs: ALPHA is the desired significance level (default = 0.05);
%   TYPE indicates the type of test (default = 'unequal'). H indicates the
%   result of the hypothesis test:
%      H = 0 => Do not reject the null hypothesis at significance level ALPHA.
%      H = 1 => Reject the null hypothesis at significance level ALPHA.
%
%   Let S1(x) and S2(x) be the empirical distribution functions from the
%   sample vectors X1 and X2, respectively, and F1(x) and F2(x) be the
%   corresponding true (but unknown) population CDFs. The two-sample K-S
%   test tests the null hypothesis that F1(x) = F2(x) for all x, against the
%   alternative specified by TYPE:
%      'unequal' -- "F1(x) not equal to F2(x)" (two-sided test)
%      'larger'  -- "F1(x) > F2(x)" (one-sided test)
%      'smaller' -- "F1(x) < F2(x)" (one-sided test)
%
%   For TYPE = 'unequal', 'larger', and 'smaller', the test statistics are
%   max|S1(x) - S2(x)|, max[S1(x) - S2(x)], and max[S2(x) - S1(x)],
%   respectively.
%
%   The decision to reject the null hypothesis occurs when the significance
%   level, ALPHA, equals or exceeds the P-value.
%
%   X1 and X2 are vectors of lengths N1 and N2, respectively, and represent
%   random samples from some underlying distribution(s). Missing
%   observations, indicated by NaNs (Not-a-Number), are ignored.
%
%   [H,P] = KSTEST2(...) also returns the asymptotic P-value P.
%
%   [H,P,KSSTAT] = KSTEST2(...) also returns the K-S test statistic KSSTAT
%   defined above for the test type indicated by TYPE.
%
%   The asymptotic P-value becomes very accurate for large sample sizes, and
%   is believed to be reasonably accurate for sample sizes N1 and N2 such
%   that (N1*N2)/(N1 + N2) >= 4.
%
%   See also KSTEST, LILLIETEST, CDFPLOT.
%
% Copyright 1993-2008 The MathWorks, Inc.
% $Revision: 1.5.2.6 $   $Date: 1998/01/30 13:45:34 $

% References:
%   Massey, F.J., (1951) "The Kolmogorov-Smirnov Test for Goodness of Fit",
%         Journal of the American Statistical Association, 46(253):68-78.
%   Miller, L.H., (1956) "Table of Percentage Points of Kolmogorov Statistics",
%         Journal of the American Statistical Association, 51(273):111-121.
%   Stephens, M.A., (1970) "Use of the Kolmogorov-Smirnov, Cramer-Von Mises and
%         Related Statistics Without Extensive Tables", Journal of the Royal
%         Statistical Society. Series B, 32(1):115-122.
%   Conover, W.J., (1980) Practical Nonparametric Statistics, Wiley.
%   Press, W.H., et. al., (1992) Numerical Recipes in C, Cambridge Univ. Press.

if nargin < 2
    error('stats:kstest2:TooFewInputs','At least 2 inputs are required.');
end

%
% Ensure each sample is a VECTOR.
%

if ~isvector(x1) || ~isvector(x2)
    error('stats:kstest2:VectorRequired','The samples X1 and X2 must be vectors.');
end

%
% Remove missing observations indicated by NaN's, and
% ensure that valid observations remain.
%

x1  =  x1(~isnan(x1));
x2  =  x2(~isnan(x2));
x1  =  x1(:);
x2  =  x2(:);

if isempty(x1)
   error('stats:kstest2:NotEnoughData', 'Sample vector X1 contains no data.');
end

if isempty(x2)
   error('stats:kstest2:NotEnoughData', 'Sample vector X2 contains no data.');
end

%
% Ensure the significance level, ALPHA, is a scalar
% between 0 and 1 and set default if necessary.
%

if (nargin >= 3) && ~isempty(alpha)
   if ~isscalar(alpha) || (alpha <= 0 || alpha >= 1)
      error('stats:kstest2:BadAlpha',...
            'Significance level ALPHA must be a scalar between 0 and 1.');
   end
else
   alpha  =  0.05;
end

%
% Ensure the type-of-test indicator, TYPE, is a scalar integer from
% the allowable set, and set default if necessary.
%

if (nargin >= 4) && ~isempty(tail)
   if ischar(tail)
      tail = strmatch(lower(tail), {'smaller','unequal','larger'}) - 2;
      if isempty(tail)
         error('stats:kstest2:BadTail',...
               'Type-of-test indicator TYPE must be ''unequal'', ''smaller'', or ''larger''.');
      end
   elseif ~isscalar(tail) || ~((tail==-1) || (tail==0) || (tail==1))
      error('stats:kstest2:BadTail',...
            'Type-of-test indicator TYPE must be ''unequal'', ''smaller'', or ''larger''.');
   end
else
   tail  =  0;
end

%
% Calculate F1(x) and F2(x), the empirical (i.e., sample) CDFs.
%

binEdges    =  [-inf ; sort([x1;x2]) ; inf];

binCounts1  =  histc (x1 , binEdges, 1);
binCounts2  =  histc (x2 , binEdges, 1);

sumCounts1  =  cumsum(binCounts1)./sum(binCounts1);
sumCounts2  =  cumsum(binCounts2)./sum(binCounts2);

sampleCDF1  =  sumCounts1(1:end-1);
sampleCDF2  =  sumCounts2(1:end-1);

%
% Compute the test statistic of interest.
%

switch tail
   case  0      %  2-sided test: T = max|F1(x) - F2(x)|.
      deltaCDF  =  abs(sampleCDF1 - sampleCDF2);

   case -1      %  1-sided test: T = max[F2(x) - F1(x)].
      deltaCDF  =  sampleCDF2 - sampleCDF1;

   case  1      %  1-sided test: T = max[F1(x) - F2(x)].
      deltaCDF  =  sampleCDF1 - sampleCDF2;
end

KSstatistic   =  max(deltaCDF);

%
% Compute the asymptotic P-value approximation and accept or
% reject the null hypothesis on the basis of the P-value.
%

n1     =  length(x1);
n2     =  length(x2);
n      =  n1 * n2 /(n1 + n2);
lambda =  max((sqrt(n) + 0.12 + 0.11/sqrt(n)) * KSstatistic , 0);

if tail ~= 0        % 1-sided test.

   pValue  =  exp(-2 * lambda * lambda);

else                % 2-sided test (default).
%
%  Use the asymptotic Q-function to approximate the 2-sided P-value.
%
   j       =  (1:101)';
   pValue  =  2 * sum((-1).^(j-1).*exp(-2*lambda*lambda*j.^2));
   pValue  =  min(max(pValue, 0), 1);

end

H  =  (alpha >= pValue);
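The p-value step that closes the function can be traced on its own. The sketch below applies the same formulas to made-up inputs (`n1 = n2 = 4`, `ksstat = 0.5`); those values and the names `p1`, `p2` and `h` are assumptions for illustration only. With these sizes (N1*N2)/(N1 + N2) is 2, below the threshold of 4 quoted in the help text, so the numbers show the arithmetic rather than a trustworthy test result.

```matlab
% Hypothetical inputs, chosen only to illustrate the arithmetic.
n1 = 4;  n2 = 4;           % sample sizes
ksstat = 0.5;              % two-sided statistic max|S1(x) - S2(x)|
alpha  = 0.05;             % significance level

% Effective sample size and scaled statistic, using the listing's formula.
n      = n1 * n2 / (n1 + n2);
lambda = max((sqrt(n) + 0.12 + 0.11/sqrt(n)) * ksstat, 0);

% Two-sided p-value: alternating series truncated at 101 terms.
j  = (1:101)';
p2 = 2 * sum((-1).^(j-1) .* exp(-2 * lambda^2 * j.^2));
p2 = min(max(p2, 0), 1);   % clamp to [0, 1]

% One-sided p-value for the same lambda.
p1 = exp(-2 * lambda^2);

h = (alpha >= p2);         % reject only when alpha equals or exceeds p
fprintf('lambda = %.4f, p (2-sided) = %.4f, p (1-sided) = %.4f, h = %d\n', ...
        lambda, p2, p1, h);
```

The series terms shrink very quickly here (the third term is already on the order of 1e-5), which suggests the 101-term truncation is more than enough for this lambda. The clamp guards the case of lambda near zero, where the alternating sum no longer settles cleanly.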

