OEL-02

The document outlines a data analysis process involving health insurance data, including steps for data preprocessing, clustering, and predictive modeling. It utilizes various Python libraries for data manipulation and analysis, such as pandas, numpy, and sklearn. Key recommendations include targeting interventions for high-risk clusters and suggesting loyalty benefits for low-risk customers to enhance retention.

Uploaded by

shapparhay
Minal Fatima 21-CP-55 OEL-02 DWM

# Step 1: Import Necessary Libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.cluster import KMeans
from sklearn.metrics import mean_squared_error, accuracy_score, classification_report
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Step 2: Load the Dataset
from google.colab import files
uploaded = files.upload()  # Upload the CSV file here
file_name = list(uploaded.keys())[0]
df = pd.read_csv(file_name)  # Read the uploaded CSV into a DataFrame

Output:
modified: 13/12/2024 - 100% done
Saving health-insurance-data.csv to health-insurance-data.csv

# Step 3: Data Preprocessing
# Inspect the dataset
print("Dataset Overview:")
print(df.head())

[Output: first five rows with columns CustomerID, Age, Gender, Region, HealthCondition, ClaimsCount, ClaimAmount, InsurancePremium, LoyaltyYears. Cell values are not legible in this copy.]

# Handle missing values (median imputation)
imputer = SimpleImputer(strategy='median')
df['ClaimAmount'] = imputer.fit_transform(df[['ClaimAmount']]).ravel()

# Cap outliers using IQR-based bounds (only the upper bound is applied)
q1 = df['ClaimAmount'].quantile(0.25)
q3 = df['ClaimAmount'].quantile(0.75)
iqr = q3 - q1
lower_bound = q1 - 2.5 * iqr
upper_bound = q3 + 2.5 * iqr
df['ClaimAmount'] = np.where(df['ClaimAmount'] > upper_bound, upper_bound, df['ClaimAmount'])

# Encode categorical variables
df = pd.get_dummies(df, columns=['Gender', 'Region', 'HealthCondition'], drop_first=True)

# Standardize numerical features
scaler = StandardScaler()
num_cols = ['Age', 'ClaimsCount', 'ClaimAmount', 'InsurancePremium', 'LoyaltyYears']
df[num_cols] = scaler.fit_transform(df[num_cols])
print("Data after preprocessing:")
print(df.head())

[Output: first five rows with columns CustomerID, Age, ClaimsCount, ClaimAmount, InsurancePremium, LoyaltyYears, Gender_Male, Region_North, Region_South, Region_West, HealthCondition_Cardiac, HealthCondition_Diabetes, HealthCondition_Hypertension. Cell values are not legible in this copy.]

# Step 4: Pattern Discovery
# Clustering
kmeans = KMeans(n_clusters=2, random_state=42)
df['Cluster'] = kmeans.fit_predict(df[num_cols])
sns.pairplot(df, hue='Cluster', vars=num_cols)
plt.show()

# Step 5: Predictive Modeling
# Split dataset
X = df.drop(['CustomerID', 'ClaimAmount'], axis=1)
y = df['ClaimAmount']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# [The model training and evaluation cell is not legible in this copy; `model` below refers to the fitted regressor.]

feature_importances = pd.Series(model.feature_importances_, index=X.columns)
feature_importances.nlargest(10).plot(kind='barh')
plt.title("Feature Importances")
plt.show()

[Figure: horizontal bar chart "Feature Importances" with bars for HealthCondition_Diabetes, HealthCondition_Cardiac, Region_South, Gender_Male, Region_North, ClaimsCount, LoyaltyYears, Age, InsurancePremium, Cluster.]

print("Recommendations:")
print("1. Identify high-risk clusters for targeted interventions.")
print("2. Focus on customers with health conditions contributing to higher claims.")
print("3. Suggest loyalty benefits for long-term low-risk customers to retain them.")
print("4. Use predictive modeling for proactive cost management by anticipating high claims.")
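
The cell that trains the regressor between the train/test split and the feature-importance plot did not survive in this copy. The following is a minimal sketch of what such a step could look like, not the author's actual code. It assumes a RandomForestRegressor (which the notebook imports) with n_estimators=100, evaluates it with mean_squared_error, and runs on synthetic stand-in data because the health insurance CSV is not available here. The feature values, the target formula, and the hyperparameters are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical values, NOT the health insurance dataset).
rng = np.random.default_rng(42)
feature_names = ["Age", "ClaimsCount", "InsurancePremium", "LoyaltyYears"]
X = rng.normal(size=(200, len(feature_names)))
# Assumed relationship for the demo only: the target depends mostly on ClaimsCount.
y = 2.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=200)

# Same split settings as the notebook: 30% test set, random_state=42.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# n_estimators=100 is an assumption; the original hyperparameters are unknown.
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Evaluate on the held-out test set.
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f"Mean squared error on the test set: {mse:.4f}")

# Rank features by importance, largest first.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

With the real data, X and y would come from the DataFrame split in Step 5, and the fitted model's feature_importances_ attribute is what the bar chart plots.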
