PPT6-W6-Big Data Integration
Insurance Company’s Partial Schema
Policies(PolicyKey, PolicyTypeKey, Agent, Conditions)
PolicySales(PolicyKey, PolicyholderKey, StartDate, TransactKey, Premium, CoveragePeriod, CoverageLimit)
Transactions(TransactKey, Date, Time, Amount, Balance)
Policyholders(PolicyHolderKey, Name, Address, City, State, ZIP)
Claims(PolicyKey, ClaimKey, TransactKey, ClaimAmount)
ClaimDescription(ClaimKey, TypeKey, ClaimantKey, ProcCode, Description)
Claimants(ClaimantKey, Name, Address, City, State, ZIP)
ClaimTypes(TypeKey, Description)
PolicyTypes(PolicyTypeKey, Name, Description)

Bank’s Partial Schema
Accounts(AcctNumber, AcctType, MemberID, MemberType, TypeID, StartDate, EndDate, InterestRate, CreditLimit)
Individuals(MemberID, FName, MI, LName, SSN, Nationality, DoB, LegalStatus, FullAddress, Phone, PhoneType, Email)
Corporations(MemberID, Name, RegisteredAddress, CorporationType, Signatory1, Signatory2, DNBNumber, Phone, Email)
Transactions(TrID, AcctNum, Date, Time, TransactionType, Description, TransactionAmount, Debit/Credit, Balance, Payoff)
AccountType(TypeID, Name, Description)
TransactionTypes(Ttype, Name, Description)
Disputes(AccntNumber, DisputeID, TrID, Date, DisputeAmt, Explanation, Valid, ValidatorID)
Source relations:
PolicySales(PolicyKey, PolicyholderKey, StartDate, TransactKey, Premium, CoveragePeriod, CoverageLimit)
Policyholders(PolicyHolderKey, Name, Address, City, State, ZIP)
Accounts(AcctNumber, AcctType, MemberID, MemberType, TypeID, StartDate, EndDate, InterestRate, CreditLimit)
Individuals(MemberID, FName, MI, LName, SSN, Nationality, DoB, LegalStatus, FullAddress, Phone, PhoneType, Email)

Target relation:
discountCandidates(custID, address, policyKey, AcctNumber)
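The mapping from the four source relations to discountCandidates can be sketched as a join. The slide does not state the join conditions, so the sketch below assumes that a policyholder and a bank individual are the same customer when Policyholders.Name equals FName and LName joined by a space, and that custID is taken from MemberID. The rows and the function name `build_discount_candidates` are made up for illustration, and only the columns needed for the join are created.

```python
import sqlite3

def build_discount_candidates(conn):
    """Populate the target relation from the sources (assumed join conditions)."""
    query = """
        SELECT i.MemberID   AS custID,
               ph.Address   AS address,
               ps.PolicyKey AS policyKey,
               a.AcctNumber AS AcctNumber
        FROM Policyholders ph
        JOIN PolicySales ps ON ps.PolicyholderKey = ph.PolicyHolderKey
        -- Assumption: exact name equality identifies the same customer.
        JOIN Individuals i  ON ph.Name = i.FName || ' ' || i.LName
        JOIN Accounts a     ON a.MemberID = i.MemberID
        ORDER BY custID
    """
    return conn.execute(query).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PolicySales(PolicyKey, PolicyholderKey);
    CREATE TABLE Policyholders(PolicyHolderKey, Name, Address);
    CREATE TABLE Accounts(AcctNumber, MemberID);
    CREATE TABLE Individuals(MemberID, FName, LName);

    -- Hypothetical rows, not taken from the slides.
    INSERT INTO Policyholders VALUES ('PH-1', 'Ann Lee', '1 Main St');
    INSERT INTO Policyholders VALUES ('PH-2', 'Bob Ray', '2 Oak St');
    INSERT INTO PolicySales   VALUES ('P-10', 'PH-1');
    INSERT INTO PolicySales   VALUES ('P-20', 'PH-2');
    INSERT INTO Individuals   VALUES (1, 'Ann', 'Lee');
    INSERT INTO Individuals   VALUES (2, 'Cy', 'Poe');
    INSERT INTO Accounts      VALUES ('A-100', 1);
    INSERT INTO Accounts      VALUES ('A-200', 2);
""")

candidates = build_discount_candidates(conn)
print(candidates)
```

Only the customer who appears in both companies' data reaches the target relation. Exact name equality is a fragile join key, which is why matching records across sources is treated as a problem in its own right.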
Policyholders(PolicyHolderKey, Name, Address, City, State, ZIP)
Individuals(MemberID, FName, MI, LName, SSN, Nationality, DoB, LegalStatus, FullAddress, Phone, PhoneType, Email)
4-937528734’
X Y
Policyholders(PolicyHolderKey, Name, Address, City, State, ZIP)
Individuals(MemberID, FName, MI, LName, SSN, Nationality, DoB, LegalStatus, FullAddress, Phone, PhoneType, Email)

Individuals(101, Stephen, C., Jones, 123-45-6789, US, 10/02/1983, citizen, “231 Cedar St. LA, CA 90005”, 661-266-9374, landline, [email protected])
Individuals(102, Elizabeth, , McFarlane, 123-54-6789, US, 06/18/1978, citizen, “4157 Elm St. LA, CA 90005”, 213-266-9374, mobile, [email protected])
Individuals(103, Liz, P., McFarlane-Gray, 123-92-2318, US, 06/18/1978, citizen, “231 Cedar St. LA, CA 90005”, 213-702-4343, landline, [email protected])
Individuals(104, Lisa, M., Brady, 423-45-6209, US, 08/09/1975, foreign-student, “231 Cedar St. LA, CA 90005”, 302-266-9374, landline, [email protected])

Policyholders(3-764528104, Liz, P., McFarlane-Gray, 4157 Elm St. LA, CA, 90005)
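To see why matching the Policyholders record against the Individuals records is not trivial, the comparison can be sketched with per-field string similarity. This is only an illustration: `difflib.SequenceMatcher` stands in for whatever similarity measure the course has in mind, and the helper names `normalize` and `similarity` are hypothetical. Only the name and address fields from the records above are used.

```python
import re
from difflib import SequenceMatcher

def normalize(text):
    """Lowercase and replace each run of punctuation or spaces with one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def similarity(a, b):
    """Similarity in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

# (MemberID, FName, MI, LName, FullAddress) copied from the Individuals records.
individuals = [
    (101, "Stephen", "C.", "Jones", "231 Cedar St. LA, CA 90005"),
    (102, "Elizabeth", "", "McFarlane", "4157 Elm St. LA, CA 90005"),
    (103, "Liz", "P.", "McFarlane-Gray", "231 Cedar St. LA, CA 90005"),
    (104, "Lisa", "M.", "Brady", "231 Cedar St. LA, CA 90005"),
]

# Name and address of policyholder 3-764528104.
policyholder_name = "Liz P. McFarlane-Gray"
policyholder_address = "4157 Elm St. LA, CA, 90005"

# MemberID -> (name similarity, address similarity)
scores = {}
for member_id, fname, mi, lname, address in individuals:
    full_name = " ".join(part for part in (fname, mi, lname) if part)
    scores[member_id] = (
        similarity(policyholder_name, full_name),
        similarity(policyholder_address, address),
    )

for member_id, (name_sim, addr_sim) in scores.items():
    print(member_id, round(name_sim, 2), round(addr_sim, 2))
```

On names alone the policyholder matches member 103 exactly, while on addresses alone it matches member 102 exactly. A linkage decision therefore has to weigh several fields together rather than rely on any single one.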
BankTransactions(TransactionID (TID), TransactionBeginTime (TBT), TransactionEndTime (TET), TransactionAmount (TA), Credit-Debit (CD), TransactionParty (TP), TransactionDescription (TD), Balance (B), Payoff (P))
InsuranceTransactions(TransactionID (TID), TransactionDateTime (TDT), TransactionType (TT), Amount (A), TransactionDetails (TDT))

• Compute pairwise attribute similarity and, using a threshold plus or minus an error margin, put similar attributes in the same cluster.
• For every subset of uncertain pairs (pairs whose similarity falls within the error margin around the threshold), create a mediated schema.

Med1({TID}, {TBT, TET, TDT}, {TA+CD, A}, {TP, TD, TDT}, {TT}, {B}, {P})
Med2({TID}, {TBT}, {TET}, {TDT}, {TA+CD, A}, {TP}, {TD}, {TDT}, {TT}, {B}, {P})
Med3({TID}, {TBT, TDT}, {TET, TDT}, {TA+CD, A}, {TP}, {TD, TDT}, {TT}, {B}, {P})
…
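The two steps (cluster similar attributes, then create one mediated schema per subset of uncertain pairs) can be sketched as follows. The similarity scores, the threshold `tau`, and the error margin `eps` are made-up values, and the function names are hypothetical. The sketch assumes a pair is "certain" when its similarity is at least `tau + eps` and "uncertain" when it lies in `[tau - eps, tau + eps)`. Each attribute ends up in exactly one cluster here, so the sketch does not reproduce the overlapping clusters of Med3.

```python
from itertools import combinations

def classify_pairs(similarity, tau, eps):
    """Split attribute pairs into certain and uncertain matches."""
    certain = [p for p, s in similarity.items() if s >= tau + eps]
    uncertain = [p for p, s in similarity.items() if tau - eps <= s < tau + eps]
    return certain, uncertain

def cluster(attributes, edges):
    """Group attributes into connected components using union-find."""
    parent = {a: a for a in attributes}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a

    for a, b in edges:
        parent[find(a)] = find(b)
    groups = {}
    for a in attributes:
        groups.setdefault(find(a), set()).add(a)
    return frozenset(frozenset(g) for g in groups.values())

def candidate_mediated_schemas(attributes, similarity, tau, eps):
    """One clustering per subset of uncertain pairs, duplicates removed."""
    certain, uncertain = classify_pairs(similarity, tau, eps)
    schemas = set()
    for k in range(len(uncertain) + 1):
        for chosen in combinations(uncertain, k):
            schemas.add(cluster(attributes, certain + list(chosen)))
    return schemas

# Hypothetical similarity scores for a few of the transaction attributes.
attributes = ["TA", "A", "TBT", "TET", "TDT"]
similarity = {
    ("TA", "A"): 0.90,     # clearly similar
    ("TBT", "TDT"): 0.60,  # uncertain
    ("TET", "TDT"): 0.58,  # uncertain
}

schemas = candidate_mediated_schemas(attributes, similarity, tau=0.6, eps=0.05)
for schema in schemas:
    print(sorted(sorted(c) for c in schema))
```

With two uncertain pairs there are four subsets and therefore four candidate mediated schemas. The number of candidates grows exponentially with the number of uncertain pairs, so the error margin has to be kept small in practice.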
• Med3({TID}, {TBT, TDT}, {TET, TDT}, {TA+CD, A}, {TP}, {TD, TDT}, {TT}, {B}, {P}) is better than Med1({TID}, {TBT, TET, TDT}, {TA+CD, A}, {TP, TD, TDT}, {TT}, {B}, {P}) with respect to BankTransactions
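The slide does not say why Med3 is preferred. One plausible reading is that a mediated schema should not place two distinct attributes of the same source in one cluster, since the source itself treats them as different. The sketch below checks that condition under this assumption. The function name `conflicting_clusters` is hypothetical, TA+CD is treated as a single composite attribute of BankTransactions, and the label TDT is kept exactly as the slide uses it (for both TransactionDateTime and TransactionDetails).

```python
def conflicting_clusters(mediated_schema, source_attributes):
    """Return the clusters holding two or more attributes of the same source."""
    return [
        cluster
        for cluster in mediated_schema
        if len(cluster & source_attributes) > 1
    ]

# Attributes of BankTransactions, with TA+CD treated as one composite attribute.
bank = {"TID", "TBT", "TET", "TA+CD", "TP", "TD", "B", "P"}

med1 = [
    {"TID"}, {"TBT", "TET", "TDT"}, {"TA+CD", "A"},
    {"TP", "TD", "TDT"}, {"TT"}, {"B"}, {"P"},
]
med3 = [
    {"TID"}, {"TBT", "TDT"}, {"TET", "TDT"}, {"TA+CD", "A"},
    {"TP"}, {"TD", "TDT"}, {"TT"}, {"B"}, {"P"},
]

med1_conflicts = conflicting_clusters(med1, bank)
med3_conflicts = conflicting_clusters(med3, bank)
print(len(med1_conflicts), len(med3_conflicts))
```

Under this reading, Med1 merges TBT with TET and TP with TD, all of which BankTransactions keeps apart, whereas Med3 has no such cluster.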
• SELECT doctor, chronicDisease
FROM TreatsPatient T, HasChronicDisease H
WHERE T.Patient = H.Patient
S1.Treats(d, s) → TreatsPatient(d, p) AND HasChronicDisease(p, s)
S2.Discharges(d, p, c) → DischargesPatientFromClinic(d, p, c)
S3.Treats(d, s) → TreatsPatient(d, p) AND HasChronicDisease(p, s) AND Doctors(d)
S4.Surgeons(d) → Surgeons(d)
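The relationship between the query and the source description S1 can be illustrated on a toy database. In a real integration system the mediated relations are virtual and only the sources hold data, so materializing TreatsPatient and HasChronicDisease here is purely for illustration. The column names, the rows, and the view name `S1_Treats` are assumptions, not part of the slides.

```python
import sqlite3

# The query from the slide, with columns qualified and results ordered.
QUERY = """
    SELECT T.doctor, H.chronicDisease
    FROM TreatsPatient T, HasChronicDisease H
    WHERE T.patient = H.patient
    ORDER BY T.doctor
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TreatsPatient(doctor TEXT, patient TEXT);
    CREATE TABLE HasChronicDisease(patient TEXT, chronicDisease TEXT);

    -- Hypothetical rows.
    INSERT INTO TreatsPatient VALUES ('dr_adams', 'p1');
    INSERT INTO TreatsPatient VALUES ('dr_baker', 'p2');
    INSERT INTO HasChronicDisease VALUES ('p1', 'diabetes');
    INSERT INTO HasChronicDisease VALUES ('p3', 'asthma');

    -- S1.Treats(d, s) written as a view over the mediated relations.
    CREATE VIEW S1_Treats AS
        SELECT T.doctor AS d, H.chronicDisease AS s
        FROM TreatsPatient T
        JOIN HasChronicDisease H ON T.patient = H.patient;
""")

answers = conn.execute(QUERY).fetchall()
from_s1 = conn.execute("SELECT d, s FROM S1_Treats ORDER BY d").fetchall()
print(answers)
print(from_s1)
```

Because the body of S1.Treats is the same join as the query, tuples retrieved from S1 can be returned directly as answers. S3.Treats also contributes answers, restricted to doctors listed in Doctors, while S2 and S4 do not mention the relations the query needs.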
Washington DC Disease Surveillance System (WADDS)
• HL-7
value