This document proposes a methodology to prevent both direct and indirect discrimination in data mining. It aims to develop techniques to remove biases in datasets that could lead to discriminatory decisions based on sensitive attributes like gender, race or religion. The methodology would preprocess datasets to eliminate directly or indirectly discriminatory classification rules while preserving data utility. Experimental evaluations will test the effectiveness of the proposed techniques at removing discrimination biases from original datasets.
A Methodology for Direct and Indirect Discrimination Prevention in Data Mining

M. Sowndarya (10A51A0564)
S. William Babu (10A51A05B4)
J. Satish (10A51A0568)
S. Priyanka (11A55A0505)

Under the Guidance of D.T.V. Dharmajee Rao Sir (Professor)
Project Information
Idea: To prevent direct and indirect discrimination in data mining
Development Environment: Java, NetBeans IDE, MySQL
Outcome: To obtain the required application status information using this system
ABSTRACT
• Data mining is an increasingly important technology for extracting useful
knowledge hidden in large collections of data. There are, however, negative
social perceptions about data mining, among them potential privacy invasion
and potential discrimination.
• Discrimination consists of unfairly treating people on the basis of their
belonging to a specific group. Automated data collection and data mining
techniques such as classification rule mining have paved the way to making
automated decisions, like loan granting/denial, insurance premium computation,
etc. If the training data sets are biased with regard to discriminatory
(sensitive) attributes like gender, race, religion, etc., discriminatory
decisions may ensue.
• Discrimination can be either direct or indirect. In this paper, we tackle
discrimination prevention in data mining and propose new techniques
applicable for direct or indirect discrimination prevention individually or both
at the same time. The experimental evaluations demonstrate that the
proposed techniques are effective at removing direct and/or indirect
discrimination biases in the original data set while preserving data quality.
Introduction to discrimination
In sociology, discrimination is the prejudicial
treatment of an individual based on their membership
in a certain group or category.
There are several antidiscrimination acts, which are laws
designed to prevent discrimination on the basis of a
number of attributes (e.g., race, religion, gender,
nationality, disability, marital status, and age) in
various settings (e.g., employment and training, access
to public services, credit and insurance, etc.).
• Discrimination can be either direct or indirect (also
called systematic). Direct discrimination consists of rules
or procedures that explicitly mention minority or
disadvantaged groups based on sensitive discriminatory
attributes related to group membership.
• Indirect discrimination consists of rules or procedures that, while
not explicitly mentioning discriminatory attributes, intentionally or
unintentionally could generate discriminatory decisions.
• Indirect discrimination could happen because of the availability of
some background knowledge (rules), for example, that a certain
zip code corresponds to a deteriorating area or an area with mostly
black population. The background knowledge might be accessible
from publicly available data (e.g., census data) or might be
obtained from the original data set itself because of the existence
of nondiscriminatory attributes that are highly correlated with the
sensitive ones in the original data set.
Existing System
Earlier work translated the qualitative statements in existing laws,
regulations, and legal cases into quantitative formal counterparts over
classification rules and introduced a family of measures of the degree of
discrimination of a PD (potentially discriminatory) rule. One of these
measures is the extended lift. To provide both direct rule protection (DRP)
and indirect rule protection (IRP) at the same time, an important
point is the relation between the data transformation
methods. Any data transformation to eliminate direct
discriminatory rules should not produce new redlining rules
or prevent the existing ones from being removed. Also, any
data transformation to eliminate redlining rules should not
produce new direct α-discriminatory rules or prevent
the existing ones from being removed.
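
The extended lift named above can be illustrated with a small sketch. It assumes the usual definition, elift(A, B -> C) = conf(A, B -> C) / conf(B -> C), where A is the sensitive itemset, B the context, and C the decision; a rule is then treated as α-discriminatory when its elift is at least α. The class name, method names, toy loan records, and the threshold 1.2 are hypothetical and are not taken from the project code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: names and data are hypothetical, not the project's code.
class ExtendedLiftSketch {

    // True if the record contains every attribute=value pair of the itemset.
    static boolean matches(Map<String, String> record, Map<String, String> itemset) {
        for (Map.Entry<String, String> item : itemset.entrySet()) {
            if (!item.getValue().equals(record.get(item.getKey()))) {
                return false;
            }
        }
        return true;
    }

    // conf(premise -> conclusion) = count(premise and conclusion) / count(premise).
    static double confidence(List<Map<String, String>> data,
                             Map<String, String> premise,
                             Map<String, String> conclusion) {
        int premiseCount = 0;
        int bothCount = 0;
        for (Map<String, String> record : data) {
            if (matches(record, premise)) {
                premiseCount++;
                if (matches(record, conclusion)) {
                    bothCount++;
                }
            }
        }
        return premiseCount == 0 ? 0.0 : (double) bothCount / premiseCount;
    }

    // elift(A, B -> C) = conf(A, B -> C) / conf(B -> C).
    static double elift(List<Map<String, String>> data,
                        Map<String, String> sensitive,
                        Map<String, String> context,
                        Map<String, String> outcome) {
        Map<String, String> sensitiveAndContext = new HashMap<>(context);
        sensitiveAndContext.putAll(sensitive);
        double base = confidence(data, context, outcome);
        // If the outcome never occurs in the context, report 0 instead of dividing by zero.
        return base == 0.0 ? 0.0 : confidence(data, sensitiveAndContext, outcome) / base;
    }

    // Hypothetical toy data: six loan applicants from the same city.
    static List<Map<String, String>> sampleData() {
        return List.of(
            Map.of("gender", "female", "city", "NYC", "loan", "deny"),
            Map.of("gender", "female", "city", "NYC", "loan", "deny"),
            Map.of("gender", "male", "city", "NYC", "loan", "deny"),
            Map.of("gender", "male", "city", "NYC", "loan", "grant"),
            Map.of("gender", "male", "city", "NYC", "loan", "grant"),
            Map.of("gender", "male", "city", "NYC", "loan", "grant"));
    }

    public static void main(String[] args) {
        double alpha = 1.2; // assumed discrimination threshold
        double value = elift(sampleData(),
                Map.of("gender", "female"),
                Map.of("city", "NYC"),
                Map.of("loan", "deny"));
        System.out.println("elift = " + value); // 1.0 / 0.5 = 2.0 on this toy data
        System.out.println("alpha-discriminatory: " + (value >= alpha));
    }
}
```

On this toy data the denial rate for female applicants is twice the denial rate for the whole context, so the rule would be flagged for any threshold up to 2.0.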
Proposed System
In this paper, indirect discrimination will also be referred
to as redlining, and rules causing indirect discrimination
will be called redlining rules. Indirect discrimination
could happen because of the availability of some
background knowledge (rules), for example, that a
certain zip code corresponds to a deteriorating area or
an area with mostly black population. The background
knowledge might be accessible from publicly available
data (e.g., census data) or might be obtained from the
original data set itself because of the existence of
nondiscriminatory attributes that are highly correlated
with the sensitive ones in the original data set.
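
One simple way to see how a nondiscriminatory attribute can act as a proxy is to measure the confidence of a background rule such as (zip code -> sensitive group). The sketch below is only an illustration under that assumption; it is not the redlining measure used in the paper, and the class name, zip codes, and group labels are hypothetical.

```java
import java.util.List;

// Illustrative sketch only: names and data are hypothetical, not the project's code.
class BackgroundRuleSketch {

    // Confidence of the background rule (zip = zipCode) -> (group = groupValue).
    // Each row is {zipCode, group}.
    static double ruleConfidence(List<String[]> rows, String zipCode, String groupValue) {
        int zipCount = 0;
        int bothCount = 0;
        for (String[] row : rows) {
            if (row[0].equals(zipCode)) {
                zipCount++;
                if (row[1].equals(groupValue)) {
                    bothCount++;
                }
            }
        }
        return zipCount == 0 ? 0.0 : (double) bothCount / zipCount;
    }

    // Hypothetical records: zip code paired with membership in a protected group.
    static List<String[]> sampleRows() {
        return List.of(
            new String[] {"10001", "protected"},
            new String[] {"10001", "protected"},
            new String[] {"10001", "protected"},
            new String[] {"10001", "other"},
            new String[] {"20002", "protected"},
            new String[] {"20002", "other"},
            new String[] {"20002", "other"},
            new String[] {"20002", "other"});
    }

    public static void main(String[] args) {
        List<String[]> rows = sampleRows();
        // A confidence close to 1 suggests the zip code reveals group membership.
        System.out.println("10001 -> protected: " + ruleConfidence(rows, "10001", "protected"));
        System.out.println("20002 -> protected: " + ruleConfidence(rows, "20002", "protected"));
    }
}
```

A decision rule that mentions only the zip code can then behave like a rule on the sensitive attribute whenever this confidence is high, which is the situation the redlining rules describe.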
UML diagrams (use case)
[Use case diagram. Actors: User, Manager. Use cases: Login, User Register, User Login, Home Loan Application Form, Direct Discrimination, Indirect Discrimination, Rule Generalization, Loan Status in Graphical Representation.]

Class diagram

USERREGISTER
+id
+user name
+password
+e-mail
+phone no
+address

U LOGIN
+user name
+user pw
RULEGENERALIZATION
+applicationnow
+mail-id
+password

USERLOGIN USER
+user name
+user pass word
+user id

HOME LOAN APPLICATION FORM
+name of the application
+present address
+status
+age
+pan no
+religion
+dob
+gender
+personal information()
+employment details()
+property details()

MANAGERLOGIN
+username
+password
+direct discrimination()
+indirect discrimination()
+loan status in graphical representation()
Sequence diagram
Participants: user, USER LOGIN, LOGIN, USER REGISTER, HOME LOAN APPLICATION, MANAGER LOGIN
1 : USER LOGIN()
2 : NEW USER REGISTER()
3 : login successfully()
4 : LOGIN1()
5 : HAVE A USER ID AND PW()
6 : successfully login()
9 : send to manager()
10 : successfully send()
Screen shots
Requirements
Hardware Requirements:
Processor – Pentium III
Speed – 1.1 GHz
RAM – 256 MB (min)
Hard Disk – 20 GB
Software Requirements:
• Operating System: Windows 95/98/2000/XP
• Application Server: Tomcat 5.0/6.x
• Languages: HTML, JSP, JavaScript
• Database: MySQL
• Database Connectivity: JDBC
Conclusion
We want to explore the relationship between
discrimination prevention and privacy preservation in
data mining. It would be extremely interesting to find
synergies between rule hiding for privacy-preserving
data mining and rule hiding for discrimination removal.
Just as we were able to show that indirect discrimination
removal can help direct discrimination removal, it
remains to be seen whether privacy protection can help
antidiscrimination or vice versa. The connection with
current privacy models, like differential privacy, is also
an intriguing research avenue.
Thank you!
