Package ABCanalysis

August 23, 2016


Type Package
Title Computed ABC Analysis
Version 1.1.2
Date 2016-08-22
Author Michael Thrun, Jorn Lotsch, Alfred Ultsch
Maintainer Michael Thrun <[email protected]>
Description For a given data set, the package provides a novel method of computing precise limits to acquire subsets which are easily interpreted. Closely related to the Lorenz curve, the ABC curve visualizes the data by graphically representing the cumulative distribution function. Based on an ABC analysis, the algorithm calculates the optimal limits with the help of the ABC curve by exploiting the mathematical properties of the distribution of the analyzed items. The data, containing positive values, is divided into three disjoint subsets A, B and C: subset A comprises the very profitable values, i.e. the largest data values ("the important few"); subset B comprises values where the yield equals the effort required to obtain it; and subset C comprises the non-profitable values, i.e. the smallest data values ("the trivial many"). The package is based on "Computed ABC Analysis for Rational Selection of Most Informative Variables in Multivariate Data", PLoS ONE, Ultsch, A., Lotsch, J. (2015) <DOI:10.1371/journal.pone.0129767>.
Imports plotrix
Depends R (>= 2.10)
License GPL-3
LazyLoad yes
URL https://www.uni-marburg.de/fb12/datenbionik/software-en
NeedsCompilation no
Repository CRAN
Date/Publication 2016-08-23 14:57:47

R topics documented:
ABCanalysis-package
ABCanalysis
ABCanalysisPlot
ABCcleanData
ABCcurve
ABCplot
ABCRemoveSmallYields
calculatedABCanalysis
SwissInhabitants
Index

ABCanalysis-package

Computed ABC analysis

Description
Computed ABC Analysis allows the optimal calculation of three disjoint subsets A, B and C in data sets
containing positive values:
subset A contains the few most profitable values, i.e. the largest data values ("the important few"); subset
B contains the data where the profit gain equals the effort required to obtain this gain; and subset C
contains the non-profitable values, i.e. the smallest data values ("the trivial many").
This package calculates the three subsets A, B and C by means of an algorithm based on statistically
valid definitions of thresholds for the three sets.
Note
Check out our new Umatrix package for visualisation and clustering of high-dimensional data on
our Webpage.
Author(s)
Michael Thrun, Jorn Lotsch, Alfred Ultsch
http://www.uni-marburg.de/fb12/datenbionik
<[email protected]>
References
Ultsch, A., Lotsch, J.: Computed ABC Analysis for Rational Selection of Most Informative Variables in Multivariate Data, PLoS ONE, Vol. 10(6), e0129767, doi:10.1371/journal.pone.0129767, 2015.
Examples
data("SwissInhabitants")
abc=ABCanalysis(SwissInhabitants,PlotIt=TRUE)
SetA=SwissInhabitants[abc$Aind]
SetB=SwissInhabitants[abc$Bind]
SetC=SwissInhabitants[abc$Cind]

ABCanalysis

Computed ABC analysis: calculates a division of the data into 3 classes A, B and C

Description
Divides the data into 3 classes A, B and C such that
A=Data[Aind]: much yield with low effort
B=Data[Bind]: yield and effort are about equal
C=Data[Cind]: low yield with much effort
Usage
ABCanalysis(Data,ABCcurvedata,PlotIt=FALSE)
Arguments
Data
vector [1:n], an array of data: n cases in rows of one variable. If a matrix or data frame is given, the first column is used.

ABCcurvedata
for internal use only; list returned by ABCcurve

PlotIt
default FALSE; if set (to an arbitrary value), a plot is made

Details
Pareto point: the point of the ABC curve with minimum distance to (0,1), i.e. minimal unrealized potential.
BreakEven point: B_x is the x value of the point where the slope of the ABC curve equals one.
For a further description of p in the variable AlimitIndInInterpolation, see ABCcurve.
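The point definitions above, together with the Submarginal point listed under Value, can be illustrated on a discrete ABC curve. The following is a minimal base R sketch under stated assumptions, not the package's implementation: it skips the interpolation that ABCcurve performs, approximates the slope by finite differences, and uses made-up data. The names effort, yield, ParetoInd, BreakEvenInd and SubmarginalInd are chosen here for illustration only.

```r
# Illustrative data (made up), containing only positive values
Data <- c(100, 50, 20, 10, 5, 5, 4, 3, 2, 1)

# Discrete ABC curve: sort decreasingly, then accumulate
sorted <- sort(Data, decreasing = TRUE)
n      <- length(sorted)
effort <- seq_len(n) / n                 # cumulative share of items
yield  <- cumsum(sorted) / sum(sorted)   # cumulative share of the total sum

# Pareto point: curve point closest to (0, 1)
ParetoInd <- which.min(sqrt(effort^2 + (1 - yield)^2))

# BreakEven point: curve point whose finite-difference slope is closest to 1
slope <- diff(c(0, yield)) / diff(c(0, effort))
BreakEvenInd <- which.min(abs(slope - 1))

# Submarginal point: curve point closest to (B_x, 1), with B_x the BreakEven x
Bx <- effort[BreakEvenInd]
SubmarginalInd <- which.min(sqrt((effort - Bx)^2 + (1 - yield)^2))

c(Pareto = ParetoInd, BreakEven = BreakEvenInd, Submarginal = SubmarginalInd)
```

In this discrete form the slope at item i equals sorted[i] / mean(Data), so the BreakEven point falls where the sorted data values drop to the mean.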
Value
The output is a list with the following parts:
Aind
vector [1:j], A==Data[Aind]: much yield with little effort

Bind
vector [1:l], B==Data[Bind]: effort and yield are balanced

Cind
vector [1:m], C==Data[Cind]: little yield for much effort

ABexchanged
Boolean, TRUE if point A is the BreakEven point and point B is the Pareto point, FALSE otherwise

A
c(Ax,Ay), Pareto point or BreakEven point, as indicated by ABexchanged

B
c(Bx,By), Pareto point or BreakEven point, as indicated by ABexchanged

C
Submarginal point: minimum distance to [B_x,1]

smallestAData
Boundary AB, defined by point A or B depending on ABexchanged

smallestBData
Boundary BC, defined by point C

AlimitIndInInterpolation
index of the AB boundary in [p, ABC], the interpolation of the ABC plot

BlimitIndInInterpolation
index of the BC boundary in [p, ABC], the interpolation of the ABC plot

Author(s)
Michael Thrun
http://www.uni-marburg.de/fb12/datenbionik
References
Ultsch, A., Lotsch, J.: Computed ABC Analysis for Rational Selection of Most Informative Variables in Multivariate Data, PLoS ONE, Vol. 10(6), e0129767, doi:10.1371/journal.pone.0129767, 2015.
See Also
ABCplot
Examples
data("SwissInhabitants")
abc=ABCanalysis(SwissInhabitants,PlotIt=TRUE)
A=abc$Aind
B=abc$Bind
C=abc$Cind
Agroup=SwissInhabitants[A]
Bgroup=SwissInhabitants[B]
Cgroup=SwissInhabitants[C]

ABCanalysisPlot

Displays ABC plot with ABCanalysis

Description
Displays the ABC curve: cumulative percentage of largest data (effort) vs. cumulative percentage of
sum of largest data (yield), with the set limits generated by a computed ABCanalysis.
Usage
ABCanalysisPlot(Data, LineType = 0, LineWidth = 3,
ShowUniform = TRUE,title, limits = TRUE, MarkPoints = TRUE,
ABCcurvedata,ResetPlotDefaults=TRUE)

Arguments
Data
vector [1:n], an array of data: n cases in rows of one variable

LineType
integer, optional, default LineType=0 for a solid line; for other line codes see the documentation about pch

LineWidth
integer, optional, width of the line, see lwd in par

ShowUniform
boolean, optional; if TRUE (default), the ABC curve of the uniform distribution is shown in the plot

title
string, optional, see parameter main in plot

limits
boolean, optional; if TRUE (the default shown in Usage), the lines dividing A, B and C are drawn

MarkPoints
boolean, optional, default TRUE; marks the three points of interest

ABCcurvedata
optional, see ABCcurve

ResetPlotDefaults
optional, default TRUE. If ResetPlotDefaults=FALSE, multiple plots in one window are possible, but the plot parameters are not reset to their defaults.
Value
The returned object is a list with the following items:
ABC
Output of ABCplot

ABCanalysis
Output of ABCanalysis

Note
The Break Even point is always marked with a green star.
The diagonal from (0,1) to (1,0) is the equilibrium, where effort equals yield.
Author(s)
Michael Thrun
http://www.uni-marburg.de/fb12/datenbionik
See Also
ABCanalysis
Examples
## Standard Example
data("SwissInhabitants")
abc=ABCanalysisPlot(SwissInhabitants)
## Multiple plots in one Window:
m=runif(4,100,200)
s=runif(4,1,10)
Data=sapply(1:4,FUN=function(x,m,s) rnorm(1000,m,s),m,s)
# windows() #screen devices should not be used in examples etc
par(mfrow=c(2,2))

for (i in 1:4)
{
ABCanalysisPlot(Data[,i],ResetPlotDefaults=FALSE)
}

ABCcleanData

Data cleaning for ABC analysis

Description
Only the first column of Data is used; anything that is not a positive numerical value is set to zero.
Usage
ABCcleanData(Data)
Arguments
Data

vector[1:n] describes an array of data: n cases in rows of one variable

Details
Values < 0 are set to zero; non-numeric values (NA, NaN, etc.) in Data are set to zero; strings and
characters are set to zero; infinite values are set to max(Data).
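The cleaning rules above can be sketched in base R as follows. This is an illustration under stated assumptions, not the code of ABCcleanData: the function name clean_sketch is hypothetical, max(Data) is taken to mean the largest finite value, and -Inf is treated as a negative value and therefore set to zero.

```r
# Hypothetical helper illustrating the cleaning rules; not part of the package
clean_sketch <- function(Data) {
  # Only the first column of a matrix or data frame is used
  if (is.matrix(Data) || is.data.frame(Data)) Data <- Data[, 1]

  # Strings that are not numbers become NA during the conversion
  x <- suppressWarnings(as.numeric(as.character(Data)))

  x[is.na(x)] <- 0   # NA and NaN (is.na() is TRUE for both) -> 0
  x[x < 0]    <- 0   # negative values (assumption: including -Inf) -> 0

  # Assumption: +Inf is replaced by the largest finite value, if there is one
  if (any(is.finite(x))) {
    x[is.infinite(x)] <- max(x[is.finite(x)])
  }
  x
}

clean_sketch(c(5, -1, NA, NaN, Inf, 2))
clean_sketch(c("3", "abc", "-2"))
```

The negative values are zeroed before the maximum is taken, so a stray -Inf cannot influence the replacement value for +Inf.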
Value
The output is a list with the following parts:
CleanedData
vector [1:m], column vector containing Data>=0 and zeros for all NA, NaN and negative values in Data(1:n)

Data2CleanInd
vector [1:k], index such that CleanedData = nantozero(Data(Data2CleanInd))

RemovedInd
vector [1:l], index such that Data(RemovedInd) is the data that has been removed if RemoveSmallYields==1

Author(s)
http://www.uni-marburg.de/fb12/datenbionik
Michael Thrun

ABCcurve

Calculates the ABC curve

Description
Calculates cumulative percentage of largest data (effort) and cumulative percentages of sum of
largest Data (yield) with spline interpolation (second order, piecewise) of values in-between.

Usage
ABCcurve(Data, p)

Arguments
Data
vector [1:n], an array of data: n cases in rows of one variable

p
optional, a vector of values specifying where the interpolation takes place, created by seq of package base

Value
The output is a list with the following parts:
ABCx
vector [1:k], cumulative population in percent

ABCy
vector [1:k], cumulative high data in percent
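As a rough illustration of what the two output vectors contain, the following base R sketch computes effort and yield for a small vector and evaluates the curve at the positions p. It rests on several assumptions: the function name abc_curve_sketch is hypothetical, linear interpolation via approx is swapped in for the package's second-order spline, and values are returned as fractions between 0 and 1 rather than percentages.

```r
# Hypothetical stand-in for illustration only; not the package's ABCcurve
abc_curve_sketch <- function(Data, p = seq(0, 1, by = 0.25)) {
  sorted <- sort(Data, decreasing = TRUE)

  # Start the curve at (0, 0), then accumulate item share and sum share
  effort <- c(0, seq_along(sorted) / length(sorted))
  yield  <- c(0, cumsum(sorted) / sum(sorted))

  # Linear interpolation at the requested positions (assumption, see above)
  interp <- stats::approx(effort, yield, xout = p)
  list(ABCx = interp$x, ABCy = interp$y)
}

curve <- abc_curve_sketch(c(1, 2, 3, 4))
curve$ABCy
```

Because the data are sorted decreasingly before accumulation, the resulting curve is concave and lies on or above the diagonal from (0,0) to (1,1).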

Author(s)
Michael Thrun
http://www.uni-marburg.de/fb12/datenbionik

References
Ultsch, A., Lotsch, J.: Computed ABC Analysis for Rational Selection of Most Informative Variables in Multivariate Data, PLoS ONE, Vol. 10(6), e0129767, doi:10.1371/journal.pone.0129767, 2015.

ABCplot

Displays an ABC curve as an alternative to a Lorenz curve

Description
Plots cumulative percentage of largest data (effort) vs. cumulative percentage of sum of largest data
(yield)
Usage
ABCplot(Data, LineType = 0, LineWidth = 3, ShowUniform = TRUE,
title, ABCcurvedata,defaultAxes = TRUE)
Arguments
Data
vector [1:n], an array of data: n cases in rows of one variable

LineType
for plot, default LineType=0 for a line; for other line codes see the documentation about pch in par

LineWidth
integer, width of the line, see lwd in par

ShowUniform
boolean; if TRUE, the ABC curve of the uniform distribution is shown in the plot

title
string, optional, see parameter main in plot

ABCcurvedata
optional, see ABCcurve

defaultAxes
boolean, optional, see parameter axes in plot

Value
The output is a list with the following parts:
ABCx
vector [1:k], cumulative population in percent

ABCy
vector [1:k], cumulative high data in percent

Note
The diagonal from (1,0) to (0,1) is the Equilibrium, where effort equals yield
Author(s)
Michael Thrun
http://www.uni-marburg.de/fb12/datenbionik
Examples
data("SwissInhabitants")
vec=ABCplot(SwissInhabitants)

ABCRemoveSmallYields

Extended data cleaning for ABC analysis

Description
Only the first column of Data is used; anything that is not a positive numerical value is set to zero.
Usage
ABCRemoveSmallYields(Data,CumSumSmallestPercentage)
Arguments
Data
vector [1:n], an array of data: n cases in rows of one variable

CumSumSmallestPercentage
(default = 0.5), threshold for removal: the smallest data up to a cumulated sum of less than CumSumSmallestPercentage are removed
Details
Values < 0 are set to zero; non-numeric values (NA, NaN, etc.) in Data are set to zero; strings and
characters are set to zero; infinite values are set to max(Data). In addition, the smallest data up to a
cumulated sum of less than CumSumSmallestPercentage of the total sum (yield) are removed.
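The removal step described above can be sketched in base R as follows. This is an illustration, not the package's code: the function name remove_small_sketch and the output name KeptInd are hypothetical, the input is assumed to be already cleaned (non-negative, finite), and the threshold is interpreted as a percentage of the total sum, which the text above does not state explicitly.

```r
# Hypothetical helper illustrating the removal rule; not part of the package
remove_small_sketch <- function(x, CumSumSmallestPercentage = 0.5) {
  ord <- order(x)                               # indices from smallest to largest
  cum_pct <- cumsum(x[ord]) / sum(x) * 100      # cumulated share of the total, in %

  # Assumption: the threshold is given in percent of the total sum
  removed <- sort(ord[cum_pct < CumSumSmallestPercentage])
  kept    <- setdiff(seq_along(x), removed)

  list(SubstantialData = x[kept], KeptInd = kept, RemovedInd = removed)
}

res <- remove_small_sketch(c(1000, 1, 2, 500, 3))
res$SubstantialData
res$RemovedInd
```

Here the three smallest values together account for about 0.4 percent of the total, so they fall under the 0.5 threshold and are dropped, while the original order of the remaining values is preserved.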
Value
The output is a list with the following parts:
SubstantialData
column vector containing Data>=0 and zeros for all NaN and negative values in Data(1:n)

Data2CleanInd
index such that SubstantialData = nantozero(Data(Data2SubstantialInd))

RemovedInd
Data(RemovedInd) is the data that has been removed

Author(s)
http://www.uni-marburg.de/fb12/datenbionik
Michael Thrun


calculatedABCanalysis

Computed ABC analysis: calculates a division of the data into 3 classes A, B and C

Description
Divides the data into 3 classes A, B and C such that
A=Data[Aind]: much yield with low effort
B=Data[Bind]: yield and effort are about equal
C=Data[Cind]: low yield with much effort
Usage
calculatedABCanalysis(Data)
Arguments
Data
vector [1:n], an array of data: n cases in rows of one variable. If a matrix or data frame is given, the first column is used.

Details
Pareto point: the point of the ABC curve with minimum distance to (0,1), i.e. minimal unrealized potential.
BreakEven point: B_x is the x value of the point where the slope of the ABC curve equals one.
For a further description of p in the variable AlimitIndInInterpolation, see ABCcurve.
Value
The output is a list with the following parts:
Aind
vector [1:j], A==Data[Aind]: much yield with little effort

Bind
vector [1:l], B==Data[Bind]: effort and yield are balanced

Cind
vector [1:m], C==Data[Cind]: little yield for much effort

smallestAData
Boundary AB, defined by point A or B depending on ABexchanged

smallestBData
Boundary BC, defined by point C

Author(s)
Michael Thrun
http://www.uni-marburg.de/fb12/datenbionik
References
Ultsch, A., Lotsch, J.: Computed ABC Analysis for Rational Selection of Most Informative Variables in Multivariate Data, PLoS ONE, Vol. 10(6), e0129767, doi:10.1371/journal.pone.0129767, 2015.


See Also
ABCanalysis
Examples
data("SwissInhabitants")
abc=calculatedABCanalysis(SwissInhabitants)
A=abc$Aind
B=abc$Bind
C=abc$Cind
Agroup=SwissInhabitants[A]
Bgroup=SwissInhabitants[B]
Cgroup=SwissInhabitants[C]

SwissInhabitants

SwissInhabitants in 1900

Description
Number of inhabitants in the 2896 villages of Switzerland in the year 1900.
Usage
data("SwissInhabitants")
Details
This data set consists of the number of inhabitants in the 2896 communes, i.e. cities and villages, in
the year 1900. The individual count is the total number of persons living in the particular commune.
The data set is unordered for anonymity reasons. The data set has been used as part of a larger data
set to identify patterns of concentration in Switzerland (see reference).
Source
Schuler, M., Ullmann, D.: Eidgenossische Volkszahlung: Bevoelkerungsentwicklung der Gemeinden,
Bundesamt fur Statistik, Neuchatel, Switzerland, 2002.
References
Behnisch, M., Ultsch, A.: Population Patterns in Switzerland 1850-2000, in: Gaul, W. et al (Eds),
Advances in Data Analysis, Data Handling and Business Intelligence, Springer, Heidelberg, pp.
163-173, 2010.
Examples
data(SwissInhabitants)
## maybe str(SwissInhabitants) ; plot(SwissInhabitants) ...

Index
Topic ABC analysis
ABCanalysis
ABCanalysisPlot
ABCplot
calculatedABCanalysis
Topic ABC curve
ABCcurve
Topic ABCanalysis
ABCanalysis
ABCanalysisPlot
calculatedABCanalysis
Topic ABCcurve
ABCcurve
Topic ABC
ABCanalysis
ABCplot
calculatedABCanalysis
Topic Computed ABC analysis
calculatedABCanalysis
Topic Lorenz curve
ABCanalysis
ABCcurve
ABCplot
calculatedABCanalysis
Topic Lorenz
ABCanalysis
ABCcurve
ABCplot
calculatedABCanalysis
Topic datasets,SwissInhabitants,SwissInhabitants1900
SwissInhabitants
Topic package
ABCanalysis-package

ABCanalyse (ABCanalysis-package)
ABCanalysis
ABCanalysis-package
ABCanalysisPlot
ABCcleanData
ABCcurve
ABCplot
ABCRemoveSmallYields
calculatedABCanalysis
dbt.ABC (ABCanalysis-package)
dbt.ABCanalyse (ABCanalysis-package)
dbt.ABCanalysis (ABCanalysis-package)
par
plot
seq
SwissInhabitants
SwissInhabitants1900 (SwissInhabitants)
