Manual Pentaho Data Integration Fundamentals Parte I

Uploaded by

clarisse
Copyright
© © All Rights Reserved
Pentaho Data Integration Fundamentals
Course Code DI1000
Version 5.4.0.1
Slide Version: 54A
Student Guide

Table of Contents
Guided Demo 1 - Launch & Customize PDI
Guided Demo 2 - Creating a 'Hello World' Transformation
Exercise 1 - Generate Rows, Sequence, Select Values
Guided Demo 3 - Error Handling & Basic Logging
Guided Demo 4 - Saving a Transformation in the Repository
Guided Demo 5 - Combining Several Inputs into One Output
Guided Demo 6 - Creating kettle.properties Variables
Exercise 2 - CSV Input to Multiple Text Output Using Switch/Case
Exercise 2 Advanced - CSV Input to Multiple Text Output Using Switch/Case
Exercise 3 - Serializing Multiple Text Files
Exercise 3 Advanced - Serializing Multi
Exercise 4 - De-serializing a File
Exercise 4 Advanced - De-Serializing Multiple Text Files
Guided Demo 7 - Connections & the Database Explorer
Exercise 5 - Reading & Writing to Database Tables
Exercise 5 Advanced - Reading & Writing to Database Tables
Guided Demo 8 - Data Cleansing
Exercise 6 - Input with Parameters & Table Copy Wizard
Exercise 6 Advanced - Input with Parameters & Table Copy Wizard
Exercise 7 - Parallel Processing
Guided Demo 9 - Choosing Adequate Sample Size for 'Get Fields'
Exercise 8 - Lookups & Data Formatting
Guided Demo 10 - Creating Summary Fields Using Group By
Exercise 9 - Calculating & Aggregating Order Quantity
Exercise 9 Advanced - Calculating & Aggregating Order Quantity
Exercise 10 - Loading JVM Data into a Table
Exercise 10 Advanced - Loading JVM Data into a Table
Exercise 11 - Using the Pentaho Enterprise Repository
Guided Demo 11 - Scheduling & Monitoring
Guided Demo 12 - Detailed Logging throughout Execution
Appendix - Course Slides
Copyright © 2015 Pentaho Corporation. All trademarks are the property of their respective owners. Course books may not be reproduced or distributed, in whole or in part, without the prior written permission of Pentaho Training. www.pentaho.com/services/training or email: training@pentaho.com

Guided Demo 1 - Launch & Customize PDI

Introduction
In this guided demo, you launch Spoon, PDI's graphical designer. Then, you learn how to customize some of its options and default behavior.

Objectives
In this guided demo, you will:
* Launch Spoon
* Open Spoon's 'Options' dialog
* Describe the common options and look & feel settings
* Toggle the welcome page
* Toggle the repository's dialog at startup
* Change the grid settings

Prerequisites
You must have Pentaho Data Integration (or Pentaho Business Analytics suite) installed and properly configured.

Guided demo
In this section of the guided demo, you launch Spoon.

Step | Action
1 | To launch Spoon, PDI's graphical designer interface, from the Windows Start button: Start > All Programs > Pentaho Enterprise Edition > Design Tools > Data Integration
TIP: Create a shortcut to Spoon on the Desktop to easily start the interface throughout this course.
2 | Read the Tip, and then click the [Close] button. The PDI application with the 'Welcome!' screen is displayed.
3 | Scroll the 'Welcome!' screen to familiarize yourself with its contents.
4 | In the menu bar, click Tools | Options...
Step | Action
5 |
* Uncheck the 'Show tips at startup?' checkbox.
* Check the 'Show repository dialog at startup?' checkbox.
* Click the 'Look & Feel' tab.
NOTE: We will explain repositories in future modules. It is in those modules where you begin saving objects and configurations to a database using the repository.
6 |
* Click the edit button for the 'Font for notes' property. The 'Font' dialog is displayed. Change the font to 'Tahoma', and then click the [OK] button.
* Change the 'Grid size' property to 16.
* Click the [OK] button to accept your changes and close the 'Kettle options' dialog. The 'Info' dialog will display.
NOTE: Feel free to use any font or grid size you like. However, 32 is the recommended setting for the grid size because it makes aligning steps on the canvas easy.
7 | Click the [OK] button to close the 'Info' dialog, and then close Spoon.
[Screenshot: 'Info' dialog with the message 'Please restart the application for all language and Look&Feel changes to take effect!' and a callout to click the [OK] button]

NOTE: Although restarting Spoon after changing Look & Feel options is the best habit to have, not all options require a restart for the changes to take effect.

Verify User Interface Changes
In this section of the guided demo, you will verify that the user interface and other changes to the Spoon options took effect.

Step | Action
1 | Launch Spoon. Notice the following:
* The 'Repository Connection' dialog is displayed. Click [Cancel].
* The 'Spoon tips...' dialog is not displayed.
2 | To create a new Transformation, in the menu bar, click File | New | Transformation.
3 | Right-click anywhere on the empty canvas, and then click New note in the context menu that appears.
4 | Click in the 'Note' section of the 'Notes' dialog, type an informational message about your transformation, and then click the [OK] button.
A new note with the text you entered is displayed on the canvas.

TIP: Turn the Spoon tips back on so that you can read helpful tips every time you start Spoon in this course. Changing the grid size to 32 will also allow for easy step alignment to the canvas as you create transformations.

End of Guided Demonstration
Congratulations! You have completed this guided demonstration.

Guided Demo 2 - Creating a 'Hello World' Transformation

Introduction
In this guided demo, we will use Spoon to create a new 'Hello World' transformation. Although this is probably not a task you will be asked to do in the real world, the concepts learned in this guided demo help to build the foundation necessary for creating any transformation.

Objectives
In this guided demonstration you will:
* Learn to create a new transformation.
* Add steps and hops.
* Configure the Generate rows step.

Creating the Transformation, the Steps, and Executing

Step | Action
1 | Launch Spoon.
2 | To create a new transformation, in the toolbar, click the 'New file' icon, and then click Transformation.
3 | To create a new step in the transformation:
* In the 'Design and View Tabs' panel on the left, click the Design tab. A tree of step categories is displayed.
* Expand the Input node.
* Drag-and-drop the Generate Rows step from the Input step category to the canvas. A new step is created in the transformation.
4 | In the canvas, double-click on the Generate Rows step. The 'Generate Rows' dialog is displayed.
Guided Demo 2 - Creating a 'Hello World' Transformation, Continued

Step | Action
5 | Enter values in the properties of the 'Generate Rows' dialog as follows:
Property Name | Value
Step name | Create 15 rows
Limit | 15
6 | Fill in the 'Fields' grid of the 'Generate Rows' dialog as follows:
Column Name | Value
Name | Greeting Message
Type | String
Value | Hello, World!
IMPORTANT: Use exact spelling and case if you choose to type values for columns in a grid that have drop-down lists.
7 | Before we close this dialog and continue creating the transformation, let's make certain the step generates the data we expect.
* Click on the [Preview] button. The 'Enter preview size' dialog is displayed.
* In the 'Enter preview size' dialog, click the [OK] button.
* Verify that 15 rows of data with the message you entered are displayed, and then click the [OK] button to close the 'Examine preview data' dialog.
* Click the [OK] button to close the 'Generate Rows' dialog.
TIP: Previewing data and testing steps along the way can really help to minimize errors and troubleshooting time later in the transformation creation process.
8 | In the 'Design and View Tabs' panel, expand the Flow branch, and then drag-and-drop the Dummy (do nothing) step to the canvas, to the right of the existing step.
9 | Now you will create a hop between the two steps.
* Hover the mouse pointer over the Generate Rows step. A small toolbar will appear under the step.
* Click on the output connector icon.
* Click on the Dummy (do nothing) step. A hop is created between the two steps.
TIP: Another method to create a hop between steps:
* Click and hold on the source step using the middle mouse button.
* Point to the destination step.
* Release the middle mouse button.
10 | Right-click anywhere on the empty canvas, and then click 'New note' in the context menu that appears.

Exercise 1 - Generate Rows, Sequence, Select Values

Create the transformation
In the first part of this exercise, you will create a transformation, and then separately add and configure each step and connecting hops. You will preview each step after it is added and configured to ensure the data previewed is what is expected. This best practice technique helps to prevent configuration errors that would otherwise be difficult to troubleshoot if the entire transformation was created and configured prior to preview. Please concentrate on learning the techniques for creating the transformation, rather than the logic of the steps being used. You will learn the details of these and many other steps throughout the remainder of this course.

Objectives
In this exercise you will:
* Learn to create a new transformation.
* Add steps and hops.
* Configure and preview steps.
* Execute the transformation.

Create the Transformation and add the Generate Rows step
To create the transformation:

Step | Action
1 | In Spoon, click File | New | Transformation.
2 | To add the first step, expand the Input category in the Design tab, and then drag the Generate Rows step onto the canvas.
3 | To open the step properties dialog, in the canvas, double-click the Generate rows step.
4 | To configure the 'Generate rows' step, set the step's properties as shown in the table below:
Property Name | Value
Step name | Generate 10 Rows
Limit | 10
5 | To configure the Fields grid, add three rows configured as shown in the table below:
Name | Type | Format | Value
wantField | String | | PDI solves integration challenges.
dontWantField | Integer | | 1
dontWantDateField | Date | MM/dd/yyyy | 05/21/1956
NOTE: You must use the exact case and spelling as shown in the table above.
6 | To preview the data and confirm it is configured properly, for the 'Generate Rows' step, click [Preview], and then click [OK].
7 | Verify the data generated is correct by comparing your data with the screenshot shown below:
[Screenshot: preview of 10 identical rows, each with 'PDI solves integration challenges.', 1, and 05/21/1956]
8 | To close the 'Examine preview data' dialog, click [OK].
Guided Demo 2 - Creating a 'Hello World' Transformation, Continued

Step | Action
11 | Type an informational message about your transformation and then click the [OK] button. A new note with the text you entered is displayed on the canvas.
[Screenshot: canvas with the 'Create 15 rows' and 'Dummy (do nothing)' steps joined by a hop, with a note below them]
12 | To set the transformation properties:
* In the menu bar, click Edit | Settings.
* Provide a name for the transformation, such as 'HelloWorld'.
* Optionally, enter a more detailed description in the 'Extended description' property.
* Click the [OK] button.
13 | Use File | Save from the menu, the save icon in the toolbar, or press CTRL-S to save the transformation.
14 | Use the mouse to drag a box that surrounds both steps and the hop. Both steps will be selected.
15 | Click the Preview icon in the Sub-toolbar, located immediately above the canvas. The 'Transformation debug dialog' dialog is displayed.
16 | Click [Quick Launch], and then click [Show]. The 'Examine preview data' dialog is displayed.
[Screenshot: 'Examine preview data' dialog listing the 'Greeting Message' field with 'Hello, World!' in each row]
This dialog is the same dialog that was displayed when previewing the data from the Generate Rows step. This is because the Dummy (do nothing) step does not perform any logic, so the same data is expected.
17 | Click the [Close] button to close the 'Examine preview data' dialog.

Solution Details
The solution to this guided demonstration can be obtained using the details below:
Location: C:\pentahotraining\Solutions\Guided Demonstrations
Completed transformation: GD2_HelloWorld.ktr (Use File | Import from an XML file to import)

End of Guided Demonstration
Congratulations! You have completed this guided demonstration.

Exercise 1 - Generate Rows, Sequence, Select Values, Continued

Create the Transformation and add the Generate Rows step, continued

Step | Action
9 | To close the 'Generate Rows' dialog, click [OK].
10 | To save the transformation, in the Spoon menu, click File | Save, and then provide transformation details as shown in the table below:
Property Name | Value
Transformation name | SelectValues
Description | (enter a description of your choosing)
Extended description | (enter an extended description of your choosing)
Directory | C:\pentahotraining\My Work\EX1
11 | To close the Transformation properties dialog and save the transformation, click [OK].

Create & Configure the Add Sequence Step
To add and configure the 'Add sequence' step:

Step | Action
1 | To find the second step to add to the transformation, in the Design tab, type 'add seq' in the search field.
The Design tab's tree will automatically be filtered to display the steps that match the text entered, as shown in the screenshot below:
[Screenshot: Design tab tree filtered by the search text]
2 | To add the step, drag the Add sequence step to the canvas, placing it to the right of the first step.
3 | Add a hop from the first step to the second step.
4 | The 'Add sequence' step's default configuration is exactly how we want ours configured; therefore, no configuration is necessary. To examine its configuration, double-click the step, notice its default configuration, and then click [OK].
5 | To preview the Add Sequence step:
* Select the step.
* In the Spoon menu, click Action | Preview.
* Click the [Quick Launch] button.
6 | Verify the preview data has all of the first step's fields as well as the new field from the 'Add sequence' step, named 'valuename'.
[Screenshot: preview of the 'Add sequence' step (10 rows) with columns wantField, dontWantField, dontWantDateField, valuename]
7 | To close the 'Examine preview data' dialog, click [Close].
8 | Save the transformation.
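To picture what the first two steps of this exercise do to the stream, here is a minimal Python sketch. It is only an analogy and not how PDI is implemented. The function names are hypothetical, the value 1 for dontWantField is read from the preview, and the sequence defaults (start at 1, increment by 1) are an assumption about the 'Add sequence' step's default configuration.

```python
from datetime import date
from itertools import count


def generate_rows(limit):
    """Yield `limit` identical rows, like the 'Generate 10 Rows' step."""
    for _ in range(limit):
        yield {
            "wantField": "PDI solves integration challenges.",
            "dontWantField": 1,  # assumed from the preview data
            "dontWantDateField": date(1956, 5, 21),  # 05/21/1956 in MM/dd/yyyy
        }


def add_sequence(rows, field="valuename", start=1, increment=1):
    """Append a counter field to every row, like the 'Add sequence' step.

    The start and increment defaults are assumptions, not confirmed here.
    """
    counter = count(start, increment)
    for row in rows:
        # Keep every incoming field and add the new sequence field.
        yield {**row, field: next(counter)}


# The hop between the two steps corresponds to passing one generator to the next.
rows = list(add_sequence(generate_rows(10)))
for row in rows:
    print(row["valuename"], row["wantField"])
```

Each row keeps all of the first step's fields and gains one new field, which is what the preview in step 6 is meant to confirm.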
Exercise 1 - Generate Rows, Sequence, Select Values, Continued

Create & Configure the Select Values Step
In this section of the exercise, you will add a 'Select values' step that will show you how to copy a field in the stream, including its data, and add it to the stream as a new field. You will duplicate the 'wantField' field twice.

Step | Action
1 | To create the third step, from the Transform category of the Design tab, drag the Select values step onto the canvas.
2 | Create a hop between the steps as shown in the table below:
Source Step | Destination Step
Add sequence | Select values
3 | To open the step's properties dialog, double-click the step.
4 | To configure the step's properties, set them according to the table shown below:
Property Name | Value
Step Name | Duplicate field using Select values step
5 | To configure the fields to select and duplicate to new fields, in the Select & Alter tab, set the grid according to the table below:
Fieldname | Rename to
valuename |
wantField |
wantField | copy1
wantField | copy2
6 | Compare the step's configuration with the screenshot below and make any necessary changes.
[Screenshot: 'Select values' dialog, Select & Alter tab, with the four rows above]
7 | To close the step's properties dialog, click [OK].
Step | Action
8 | To preview the Select values step:
* Select the step.
* In the Spoon menu, click Action | Preview.
* Click the [Quick Launch] button.
9 | Verify the preview data has all of the first step's fields as well as the new field from the Add Sequence step, named valuename.
10 | To close the 'Examine preview data' dialog, click [Close].
11 | Save the transformation.

Solution Details
The solution to this exercise can be obtained using the details below:
Location: c:\pentahotraining\Solutions\Exercises
Completed transformation: EX1_SelectValues.ktr (Use File | Import from an XML file to import)

End of Exercise
Congratulations! You have completed this exercise.

Guided Demo 3 - Error Handling & Basic Logging

Introduction
This guided demonstration provides an introduction to finding and handling errors, as well as logging.

Prerequisites
You must have Pentaho Data Integration (or Pentaho Business Analytics suite) installed and properly configured. You must also have access to course files required (if any).
Objectives
After completing this guided demonstration, you will be able to:
* Know when you have an error.
* Find the log file.
* Show only Error messages.

Error Handling and Logging

Step | Action
1 | If it is not already open, open the HelloWorld transformation.
2 | Double-click on the Generate rows step named 'Create 15 rows'.
3 | Change the type of the 'Greeting Message' field from String to Integer. This will cause the transformation to error. Click [OK].
4 | Run the transformation by clicking on the green arrow in the sub-toolbar. It is the icon that is the farthest to the left.
5 | The launch dialog will appear. Click the launch button.
6 | On the canvas you will see that the 'Create 15 rows' step is now outlined in red. If you click on this icon it will show you the errors.
[Screenshot: canvas with the 'Create 15 rows' step outlined in red, connected to 'Dummy (do nothing)']
7 | Notice that in the 'Step Metrics' tab there is a large red line through the step that has the error.
8 | The best way to see the errors is to look in the log file. To do this, click on the Logging tab. Error lines are in red.
[Screenshot: 'Execution Results' panel, Logging tab, with timestamped Spoon log lines and a Java stack trace shown in red]
9 | On the Logging tab you will see an icon on the far left. Click on this icon. A dialog box will appear that will show you ONLY the error lines. This makes it much easier to find your errors.
10 | Return to the Create 15 rows step and change the field type back to String.
11 | Rerun the transformation.

TIP: When troubleshooting a transformation, it can be very helpful to know exactly where fields on a step originate from. To do this: Right-click the step you wish to learn more about, and then select Show input fields, or Show output fields. You will be presented with a grid that shows the step each field originated from.
[Screenshot: fields grid for the 'Dummy (do nothing)' step showing that the 'Greeting Message' field (type String) originates from the 'Create 15 rows' step]

Solution Details
The solution to this guided demonstration can be obtained using the details below:
Location: C:\pentahotraining\Solutions\Guided Demos
Completed transformation: GD3_HelloWorldWithError.ktr (Use File | Import from an XML file to import)

End of Exercise
Congratulations! You have completed this guided demonstration.
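The error provoked in this demonstration comes from declaring a field as Integer while its value is the text 'Hello, World!'. The short Python sketch below is a loose analogy of that mismatch, not PDI's own conversion code. The convert function and its type names are hypothetical and exist only for illustration.

```python
def convert(value, type_name):
    """Convert a configured field value to its declared type.

    Hypothetical helper: only the two types used in this demo are modelled.
    """
    converters = {"String": str, "Integer": int}
    return converters[type_name](value)


# With the type left as String, the value passes through unchanged.
print(convert("Hello, World!", "String"))

# With the type switched to Integer, the same value cannot be converted,
# which mirrors the step failing when the transformation is run.
try:
    convert("Hello, World!", "Integer")
except ValueError as err:
    print(f"conversion error: {err}")
```

Changing the type back to String, as in step 10, removes the mismatch and the conversion succeeds again.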
Guided Demo 4 - Saving a Transformation in the Repository

Introduction
In this guided demonstration, you will get a brief introduction to the repository and learn how to save an existing transformation into it. You can then use the repository throughout this course to save and organize your transformations and jobs.

Objectives
After completing this guided demonstration, you will be able to:
* Connect to an existing repository.
* Save transformations and jobs into an existing repository.

Prerequisites
Access to a student environment where a repository has already been created for you.

Repositories
Repositories are objects that provide a location to save transformations, jobs, and their configurations (metadata). PDI allows users to create and use three types of repositories, which will be looked at in detail later in this course. By default, no repository is used and transformations and jobs are stored on the local file system as individual *.ktr and *.kjb files in XML format.

NOTE: The Pentaho hosted student environments already have a repository created.

Open a Recent Transformation and Connect to a Repository
In this section of the guided demonstration, you will open a recent transformation and connect to an existing repository.
Step | Action
1 | To open a recent transformation, in the Spoon menu bar, click File | Open Recent, and then click the HelloWorld transformation.
2 | To open the 'Repository connection' dialog, in the Spoon menu bar, click Tools | Repository | Connect...
3 | To connect to the PDI_TRN repository:
* In the 'Repository connection' dialog, select PDI_TRN.
* In the User Name field, enter 'admin'.
* In the password field, enter 'password'.
* Click [OK].
CAUTION: The fields are case sensitive. The User Name field will likely already be populated for you.

The Close Files Dialog
If you receive a dialog as shown below, and you want to keep your existing files open when Spoon starts, click [No].
[Screenshot: 'Close Files' dialog with Yes and No buttons]

Transformation Properties Dialog when Connected to Repository
When you are connected to a repository, the Transformation (or Job) properties dialog allows you to save the object to the repository by clicking on the folder icon for the Directory property. This is illustrated in the screenshot below:
[Screenshot: Transformation properties dialog showing the Directory property with its folder icon]
When you are not connected to a repository, the Transformation (or Job) properties dialog only allows viewing where the object is currently saved (if it has been saved previously).

TIP: Even while you are connected to the repository, you can save an object to your environment's file system by using the File | Save As... options in the Spoon menu bar.
Save the Transformation in the Repository
This final portion of the guided demonstration has you save the open transformation into the repository that you are now connected to.

1. To open the Transformation properties dialog, press CTRL-T, or double-click an empty area on the Canvas.
2. To save the transformation to the repository:
   * In the Transformation properties dialog, click the folder icon for the Directory property.
   * In the 'Directory Selection' dialog, right-click 'public', and then click 'New sub-directory'. Name it 'PDI_Trn_Objects'. The new folder will be selected.
   * Click [OK].
3. Save the transformation.
4. To verify the transformation is saved in the repository:
   * Press CTRL-T to open the Transformation properties dialog.
   * Examine the Directory property and confirm that it shows the repository folder structure you defined.
5. To close the Transformation properties dialog, click [OK].

NOTE: Throughout this course, please feel free to create new folders and organize your transformations and jobs in the repository or file system as you like. The save locations and names in this guide are only suggestions.

End of Guided Demonstration
Congratulations! You have completed this guided demonstration.
Now you can use the repository to save your work throughout this course.

Guided Demo 5 - Combining Several Inputs into One Output

Introduction
In this guided demo, you will create a transformation that reads multiple text files selected by a regular expression. It then adds the system date/time and the transformation's last-modified date/time to the stream. Finally, the entire stream is written to one delimited text file (CSV).

Prerequisites
To complete this guided demo, you need access to the input files that reside on the student environment for this course.

Objectives
After completing this guided demo, you will be able to:
* Configure a 'Text file input' step to read multiple text files based on a regular expression.
* Exclude a specific filename to prevent it from being included as an input.
* Add various system and environment-related data to the stream.
* Create a single output file that is delimited (CSV).

Create the Transformation
1. To create the new transformation, in the Spoon menubar, click File | New | Transformation.
2. To open the Transformation properties dialog, in the Canvas, double-click on an empty area.
3. Set the transformation properties for the Transformation tab according to the table below:
   Property Name | Value
   Transformation name | MultiInputOneOutput
   Description | Add a description of your choice.
   Directory | /public/PDI_Trn_Objects
   NOTE: This is in the repository.
4. To close the Transformation properties dialog, click [OK].

Create & Configure the Text File Input Step
The first step in this transformation reads multiple text files from an input directory. It also uses regular expressions to determine which files to include and which one to exclude.

1. To create the first step, from the Input category of the Design tab, drag the Text file input step onto the Canvas.
2. To open the step's properties dialog, double-click the step.
3. To configure the step's properties, set them according to the table shown below:
   Property Name | Value
   Step Name | Text File input from Multiple Files
   File or directory | C:\pentahotraining\DataFiles\Input\MultiInputOneOutput
   Regular Expression | .*\.txt
   Exclude Regular Expression | Cancelled_Orders_Summary.txt
   IMPORTANT: Type in the path. Do not use the Browse button; Browse is used to point to a specific file.
4. To add the file and expression configuration to the Selected files grid, click [Add]. It will be added to the grid as shown in the screenshot below.
   [Screenshot: Selected files grid]
5. To verify that the correct files will be read and that the expressions are entered correctly, click the [Get filenames...] button.
   NOTE: Notice how the Cancelled_Orders_Summary.txt file is not listed.
   The exclusion expression prevents it from being included.
6. To close the 'Files read' dialog, click [Close].
7. To configure the fields for this step:
   * Click the Fields tab.
   * Click the [Get Fields] button.
   * At the 'Sample size' dialog, click [OK].
   * At the 'Scan results' dialog, click [Close].
8. Verify your fields against the list shown below:
   # | Name | Type
   1 | ordernumber | Integer
   2 | orderdate | Date
   3 | shippeddate | Date
   4 | status | String
   5 | customernumber | Integer
   6 | orderlinenumber | Integer
   7 | productcode | String
   8 | quantityordered | Integer
   9 | priceeach | Number
   NOTE: The 'Text file input' step must always be configured to read all of the fields in the file. You cannot choose a subset of the fields in the file and have it successfully read the data.
9. To verify the step is configured properly, preview the data by clicking [Preview], and then click [OK].
   Compare your data with the sample shown:
   [Screenshot: 'Examine preview data' dialog, rows of step 'Text file Input from Multiple Files', showing the shippeddate and status columns with values such as Cancelled and Disputed]
10. To close the 'Examine preview data' dialog, click [Close].
11. To close the step's properties dialog, click [OK].
12. Save the transformation.

Create & Configure the Get System Info Step
1. To use the search box to find the next step to add to your transformation, in the Design tab, enter the following text into the search box: "get sys"
   [Screenshot: Design tab search box]
2. To create the next step, from the Input category of the Design tab, drag the Get System Info step onto the Canvas.
3. Create a hop between the steps as shown in the table below:
   Source Step | Destination Step
   Text file input | Get System Info
4. To open the step's properties dialog, double-click the step.
5. To configure the Fields grid, add two rows according to the table below:
   Name | Type
   row_output_time | System date (variable)
   tr_modified_time | Date when the transformation was modified last
   The configured dialog should look like the screenshot below:
   [Screenshot: Get System Info dialog]
6. To close the step's properties dialog, click [OK].
7. Save the transformation.

Create & Configure the Text File Output Step
1. To create the next step, from the Output category of the Design tab, drag the Text file output step onto the Canvas.
2. Create a hop between the steps as shown in the table below:
   Source Step | Destination Step
   Get System Info | Text file output
3. To open the step's properties dialog, double-click the step.
4. To configure the Filename property, in the File tab, set it according to the table shown below:
   Property Name | Value
   Filename | C:\pentahotraining\DataFiles\Output\Output_AllStatuses
   NOTE: Do not add a file extension in the Filename property. The step's separate Extension property allows you to set the extension.
5. To configure the fields to include in the output file:
   * Click the Fields tab.
   * Click the [Get Fields] button.
7. To close the step's properties dialog, click [OK].
8. Save the transformation.

Execute the Transformation
1. To run the transformation, press F9, and then click [Launch].
2. There should be no errors, and the Step Metrics tab in the Execution Results pane should look similar to the screenshot.
   [Screenshot: Step Metrics tab]
   NOTE: Notice the Output field contains data. This indicates the number of lines output to the text file that the step created.
3. To verify the text file was created, navigate to the folder shown below and verify the creation of the file as shown in the example screenshot:
   C:\pentahotraining\DataFiles\Output
   [Screenshot: folder listing showing the output file, type CSV File, modified 9/26/2014 8:46 AM]
4. Open the text file and notice how orders from all of the input files are included.
   [Screenshot: Example of Output_AllStatuses.csv]
   NOTE: In the example there are no trailing spaces at the end of the values. This is because the Minimum width button was used in the Text file output step's Fields tab.

Solution Details
The solution to this guided demonstration can be obtained using the details below:
Location: C:\pentahotraining\Solutions\Guided Demos
Completed transformation: GD5_MultiInputOneOutput.ktr (Use File | Import from an XML file to import)
Text file output: Output_AllStatuses.txt

End of Guided Demonstration
Congratulations! You have completed this guided demonstration.

Guided Demo 6 - Creating kettle.properties Variables

Introduction
In PDI, it is important to create transformations that run in a variety of environments and can easily be ported from Development to Test to Production.
This guided demo shows how to create flexible transformations by using kettle.properties variables.

Prerequisites
You must have Pentaho Data Integration (or Pentaho Business Analytics suite) installed and properly configured. You must also have access to the course files required (if any).

Objectives
After completing this guided demonstration, you will be able to:
* Find and edit the kettle.properties file.
* Create two new variables, DIR_INPUT and DIR_OUTPUT.
* Use these new variables in a transformation to read and write files.

Model
[Screenshot: transformation with two steps, 'Read Order Data from CSV' and 'Write Order Data']

Creating Variables for File Paths
1. Create a new transformation.
2. Save the transformation as KettleVariables.
3. To edit the kettle.properties file, in the menu, click Edit | Edit the kettle.properties file.
   [Screenshot: Edit menu]
4. The kettle.properties file opens. Scroll to the bottom of the file and right-click the very last line. A list of grid options appears. Choose the second item on the list, 'Insert after this row'. A blank line appears. Create two blank lines.
5. In the empty lines, enter these variables:
   Variable name | Value
   DIR_INPUT | C:\pentahotraining\DataFiles\Input
   DIR_OUTPUT | C:\pentahotraining\DataFiles\Output
6. To close the kettle.properties file, click [OK].
7. Drag the CSV file input step onto the Canvas.
8. Double-click the step to open it.
9. In the Step Name field, type Read Order Data from CSV.
10. Fill in the step as shown below, using the new variable DIR_INPUT:
   Property Name | Value
   Step name | Read Order Data from CSV
   Filename | ${DIR_INPUT}\Order_File.csv
   NOTE: Parameters can be used in any field in Spoon that has a red and gray diamond. Pressing CTRL-Space in the field opens a list of the current parameters.
   Make sure that the Delimiter is set to semicolon (;). Enable Header because the file has one header row.
11. Click [Get Fields] to retrieve the input fields from your source file. A dialog box pops up asking for a sample size (number of rows). Enter 300.
12. Click [Preview] to verify that your file is being read correctly. You can change the number of rows to preview. Click [OK] to exit the step properties dialog box.
13. Open the Output category and drag a Text file output step to your transformation.
14. Create the hop between the two steps.
   [Screenshot: hop from 'Read Order Data from CSV' to 'Write Order Data']
15. Open the Text file output step. In the Step Name field, type Write Order Data.
16. In the File tab of the Text file output step, fill in the output filename using the new variable as shown below.
   [Screenshot: File tab of the Text file output step]
17. Leave the Content tab as is. Note that the delimiter will be a semicolon.
18. Click the Fields tab, and at the bottom of the page, click the [Get Fields] button.
19. Click [OK] to close the dialog.
20. Run the transformation by clicking the Run icon.
21. Notice the green check marks on the steps; they mean the transformation has run successfully. You can see on the Step Metrics tab that a file was created.
   [Screenshot: Step Metrics tab]
22. Check that the OutputTest.txt file has been created at: C:\pentahotraining\DataFiles\Output

Solution Details
The solution to this guided demonstration can be obtained using the details below:
Location: C:\pentahotraining\Solutions\Guided Demos
Completed transformation: GD6_KettleParameters.ktr (Use File | Import from an XML file to import)

End of Guided Demonstration
Congratulations! You have completed this guided demonstration.
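The way a ${NAME} reference in a field such as Filename is replaced by a variable's value can be sketched in a few lines of Python. This is only an illustration of the substitution idea, not PDI's implementation: the substitute function is a hypothetical helper, and leaving unknown names untouched is an assumption of this sketch.

```python
import re

# Variable values as entered in this guided demo.
VARIABLES = {
    "DIR_INPUT": r"C:\pentahotraining\DataFiles\Input",
    "DIR_OUTPUT": r"C:\pentahotraining\DataFiles\Output",
}

# Matches ${NAME} and captures NAME.
_VAR_PATTERN = re.compile(r"\$\{([^}]+)\}")


def substitute(text, variables):
    """Replace each ${NAME} with its value; unknown names stay as written."""
    def lookup(match):
        return variables.get(match.group(1), match.group(0))
    return _VAR_PATTERN.sub(lookup, text)


print(substitute(r"${DIR_INPUT}\Order_File.csv", VARIABLES))
print(substitute(r"${NOT_DEFINED}\Order_File.csv", VARIABLES))
```

Keeping the directory in one variable means that moving from Development to Test to Production only requires changing the value in one place, not in every step.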
Exercise 2 - CSV Input to Multiple Text Output Using Switch/Case

Introduction
In this exercise, you will create a transformation that reads a CSV file containing order and country-of-origin data. It then sends the incoming orders for specific countries to text files that it creates. Previously, you used kettle.properties variables. Here, you will use transformation parameters, giving you experience with both types.

NOTE: For those looking for more of a challenge, try the advanced version of this exercise. It is the same exercise, but without the detailed guidance. You will find it in this workbook immediately following this exercise.

Prerequisites
You must have Pentaho Data Integration (or Pentaho Business Analytics suite) installed and properly configured. You must also have access to the course files required (if any).

Objectives
After completing this exercise, you will be able to:
* Create and use transformation parameters to define the input and output folder locations.
* Create and configure a CSV file input step.
* Create and configure a Switch / Case step that sends data to specific steps depending on the data contained in the incoming data stream.
* Create and configure a Text file output step that creates a new text file.
* Create hops from a Switch / Case step that are configured to specific case values.
Create the Transformation
1. To create the new transformation, in the Spoon menubar, click File | New | Transformation.
2. To open the Transformation properties dialog, in the Canvas, double-click on an empty area.
3. Set the transformation properties for the Transformation tab according to the table below:
   Property Name | Value
   Transformation name | CsvInputTextOutput
   Description | Add a description of your choice.
   Directory | /public/PDI_Trn_Objects
   NOTE: This is in the repository.
4. Create two new parameters using the Parameters tab according to the table below:
   Parameter Name | Default Value
   KTR_DIR_INPUT | C:\pentahotraining\DataFiles\Input
   KTR_DIR_OUTPUT | C:\pentahotraining\DataFiles\Output
   TIP: Provide an optional description for each parameter to help others easily understand what they are used for.
5. To close the 'Transformation properties' dialog, click [OK].
6. To save the transformation:
   * Press CTRL-S.
   * At the 'Transformation properties' dialog, click [OK].
   * At the 'Enter a comment' dialog, enter an optional comment, and then click [OK].
   NOTE: Now as you work on your transformation, you can easily save your work.
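The role of a parameter's default value can be pictured with a small Python sketch: the default is used unless a different value is supplied when the transformation is run. The resolve function and the D:\staging override are hypothetical and only illustrate the idea; this is not how Spoon is implemented.

```python
# Default values as defined on the Parameters tab in this exercise.
DEFAULTS = {
    "KTR_DIR_INPUT": r"C:\pentahotraining\DataFiles\Input",
    "KTR_DIR_OUTPUT": r"C:\pentahotraining\DataFiles\Output",
}


def resolve(text, defaults, overrides=None):
    """Fill ${NAME} references, letting run-time values win over defaults."""
    values = dict(defaults)
    values.update(overrides or {})
    for name, value in values.items():
        text = text.replace("${" + name + "}", value)
    return text


# No override supplied: the default value is used.
print(resolve(r"${KTR_DIR_INPUT}\Order_File.csv", DEFAULTS))

# Hypothetical run-time value replacing the default for one run.
print(resolve(r"${KTR_DIR_INPUT}\Order_File.csv", DEFAULTS,
              {"KTR_DIR_INPUT": r"D:\staging"}))
```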
Create & Configure the CSV File Input Step
1. To create the first step, from the Input category of the Design tab, drag the CSV file input step onto the Canvas.
2. To open the step's properties dialog, double-click the step.
3. To set the step's properties, set them according to the table shown below:
   Property Name | Value
   Step Name | Read customer data from CSV
   Filename | ${KTR_DIR_INPUT}\Order_File.csv
   Delimiter | ;
   Lazy conversion | (unchecked)
4. To configure the fields for this step:
   * Click the [Get Fields] button.
   * At the 'Sample size' dialog, click [OK].
   * At the 'Scan results' dialog, click [Close].
5. Verify your fields against the list shown below:
   Name | Type | Format
   productcode | String |
   quantityordered | Integer |
   priceeach | Number |
   orderlinenumber | Integer |
   orderdate | Date | yyyy/MM/dd HH:mm:ss.SSS
   shippeddate | Date | yyyy/MM/dd HH:mm:ss.SSS
   status | String |
   customernumber | Integer |
   country | String |
   IMPORTANT: The CSV file input step must always be configured to read all of the fields in the file. You cannot choose a subset of the fields in the file and have it successfully read the data.
6. To close the step's properties dialog, click [OK].
7. Save the transformation.

Create & Configure the Switch/Case Step
1. To create the second step, from the Flow category of the Design tab, drag the Switch/Case step onto the Canvas.
2. To create a hop between the steps:
   * Click on the first step to select it.
   * Click and hold on the first step using the middle mouse button.
   * Drag and release the hop onto the second step.
   * Choose 'Main output of step'.
3. To open the step's properties dialog, double-click the new step.
4. To set the step's properties, set them according to the table shown below:
   Property Name | Value
   Step Name | Switch / Case on country
   Field name to switch | country
   Use string contains comparison | (checked)
   Case value data type | String
5. Set the Case values grid according to the table below:
   # | Value
   1 | Germany
   2 | France
   3 | Australia
   4 | USA
   NOTE: The remaining properties of this step are set automatically as you build and configure the rest of your transformation.
6. To close the step's properties dialog, click [OK].
7. Save the transformation.

Create & Configure the Germany Text File Output Step
1. To create the Germany Text file output step, from the Output category of the Design tab, drag the Text file output step onto the Canvas above and to the right of the Switch / Case step.
2. Create a hop between the Switch/Case and Text file output steps.
3. Set the Text file output step's properties according to the table shown below:
   Property Name | Value
   Step Name | Germany
   Filename | ${KTR_DIR_OUTPUT}\Country_Germany
4. Create a hop between the Switch/Case and Germany steps, and when prompted, choose 'The case target for value Germany'.
5. To configure the field values for this step:
   * Open the Germany step's properties dialog.
   * Click the Fields tab.
   * Click [Get Fields].
   * Click [OK].
6. Save the transformation.

Create & Configure the Remaining Output Steps
1. Create three additional Text file output steps with the step names below, and configure their properties and Fields tabs as you did for the Germany step: France, Australia, USA.
2. Create three new hops according to the table below:
   Source Step | Destination Step | Case Target Value
   Switch / Case on country | France | France
   Switch / Case on country | Australia | Australia
   Switch / Case on country | USA | USA
3. Save the transformation.
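The routing configured so far can be sketched in Python as follows. This is a simplified illustration, not the Switch / Case step's actual code: the route function is a hypothetical helper, and it assumes that with 'Use string contains comparison' checked, a row matches a case when the country field contains the case value. Rows that match no case value return None in this sketch.

```python
# Case values in the order entered in the Case values grid.
CASE_VALUES = ["Germany", "France", "Australia", "USA"]


def route(country, case_values=CASE_VALUES, default=None):
    """Return the target for the first case value found inside the field."""
    for value in case_values:
        if value in country:  # "contains" comparison, not strict equality
            return value
    return default


rows = [{"country": "Germany"}, {"country": "USA"}, {"country": "Japan"}]
for row in rows:
    print(row["country"], "->", route(row["country"]))
```

A contains comparison tolerates extra characters around the value, such as padding, which strict equality would not.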
Create & Configure the Dummy Step
1. Create a Dummy (do nothing) step and configure it according to the table below:
   Property Name | Value
   Step name | Throw out all others
2. Create a new hop according to the table below:
   Source Step | Destination Step | Case Target Value
   Switch / Case on country | Throw out all others | The default target step
3. To see how the Switch/Case step's properties have been updated as you configured the last steps and hops, open the Switch/Case step's properties dialog and compare it with the screenshot below:
   [Screenshot: Switch/Case dialog with case values Germany, France, Australia, and USA mapped to their target steps, and the default target step set to 'Throw out all others']
4. The transformation should look similar to the screenshot below:
   [Screenshot: completed transformation]
5. Save the transformation.

Execute the Transformation
1. To run the transformation, press F9, and then click [Launch].
2. There should be no errors, and the Step Metrics tab should look similar to the screenshot shown below:
   [Screenshot: Step Metrics tab]
   NOTE: Notice the Output field contains data. This indicates the number of lines output to the text files that were created.
3. To verify the text files were created, navigate to the folder shown below and verify the creation of the files as shown in the example screenshot:
   C:\pentahotraining\DataFiles\Output
   [Screenshot: folder listing showing Country_Australia.txt, Country_France.txt, and Country_Germany.txt]
4. Open each of the text files and notice how the data corresponding to each country is written to the appropriate country's text file.
   [Screenshot: Example of Country_Germany.txt]

Solution Details
The solution to this exercise can be obtained using the details below:
Location: C:\pentahotraining\Solutions\Exercises
Completed transformation: EX2_CsvInputTextOutput.ktr (Use File | Import from an XML file to import)

End of Exercise
Congratulations! You have completed this exercise.

Exercise 2 Advanced - CSV Input to Multiple Text Output Using Switch/Case

Introduction
This is an advanced version of Exercise 2. It is the same exercise, but without the detailed guidance. You may choose to do this advanced exercise rather than the standard version of Exercise 2.

Instructions
Create a transformation that can do the following:
* Read the Order_File.csv file located at C:\pentahotraining\DataFiles\Input
* Separate the order details by country.
* Output the order details to four text files by country as follows:
  o Germany
  o France
  o Australia
  o USA
* All other countries' order data should be discarded.
* The text files should be written to the following location: C:\pentahotraining\DataFiles\Output
* Use transformation parameters to define the input and output file locations.

Steps Used
Use only the steps shown below to complete this exercise:
* CSV file input
* Switch / Case
* Text file output
* Dummy (do nothing)

Finished Transformation
The completed transformation should resemble the screenshot shown below:
[Screenshot: completed transformation.]

Prerequisites
You must have Pentaho Data Integration (or Pentaho Business Analytics suite) installed and properly configured. You must also have access to course files required (if any).

End of Exercise
Congratulations! You have completed this exercise.

Exercise 3 - Serializing Multiple Text Files

Introduction
In this exercise, you will create a transformation that reads the multiple text files you created in the previous exercise. Then, the data is sent to a serialize step that creates a binary (serialized) file. A regular expression is used, along with a parameter, to determine the correct files to read.

Note
For those looking for more of a challenge, try the advanced version of this exercise.
It is the same exercise, but without the detailed guidance. You will find it in this workbook immediately following this exercise.

Objectives
After completing this exercise, you will be able to:
* Read multiple text files from a parameterized directory using the Text file input step.
* Use a regular expression to specify a pattern of filenames.
* Create and configure a Serialize step to create a binary file from an input stream.

Prerequisites
The multiple text files created from executing the previous exercise.

Create the Transformation

Step | Action
1 | To create the new transformation, in the Spoon menubar, click File | New | Transformation.
2 | To open the 'Transformation properties' dialog, in the Canvas, double-click on an empty area.
3 | Set the transformation properties for the Transformation tab according to the table below:
    Property Name | Value
    Transformation name | SerializeMultipleTextFiles
    Description | Add a description of your choice.
    Directory | /public/PDI_Trn_Objects
    NOTE: This is in the repository.
4 | To close the 'Transformation properties' dialog, click [OK].
5 | To save the transformation:
    * Press CTRL-S.
    * At the 'Transformation properties' dialog, click [OK].
    * At the 'Enter a comment' dialog, enter an optional comment, and then click [OK].
    NOTE: Now as you work on your transformation, you can easily save your work.
Create & Configure the Text File Input Step

Step | Action
1 | To create the first step, from the Input category of the Design tab, drag the Text file input step onto the Canvas.
2 | To open the step's properties dialog, double-click the step.
3 | To set the step's properties, set them according to the table shown below:
    Property Name | Value
    Step Name | Read multiple text files
    Selected files: File/Directory | ${DIR_OUTPUT}
    Selected files: Wildcard (RegExp) | Country_.*\.txt
4 | To check that the source directory and filename regular expression are correct, and that the files exist, click the [Show filename(s)...] button. You should see the following dialog and content:
    [Screenshot: dialog listing the matching Country_ text files.]
5 | Click [Close].
6 | To configure the fields for this step:
    * Click the Fields tab.
    * Click the [Get Fields] button.
    * At the 'Sample size' dialog, click [OK].
    * At the 'Scan results' dialog, click [Close].
7 | To preview the data, click the [Preview] button and verify the data returned contains all of the countries you used as input in the 'country' column.
8 | To close the step's properties dialog, click [OK].
9 | Save the transformation.
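The Wildcard (RegExp) value in the table above is a regular expression, not a shell-style wildcard. As a rough illustration only, the Python sketch below applies the same pattern to a handful of file names. The helper name matching_files and the non-matching sample names are hypothetical, and the sketch assumes the pattern has to match the whole file name, which is how the step appears to behave.

```python
import re

# Same pattern as the Wildcard (RegExp) property in the exercise.
WILDCARD = r"Country_.*\.txt"


def matching_files(names, pattern=WILDCARD):
    """Return the names that the pattern matches in full (hypothetical helper)."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


if __name__ == "__main__":
    candidates = [
        "Country_Germany.txt",
        "Country_France.txt",
        "Country_USA.txt.bak",           # rejected: extra suffix after .txt
        "CountryData_Deserialized.txt",  # rejected: no underscore after "Country"
        "Country_Australia.csv",         # rejected: wrong extension
    ]
    for name in matching_files(candidates):
        print(name)
```

The escaped dot (\.) matters: an unescaped dot would match any character, so a name such as Country_Xtxt could slip through.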
Create & Configure the Serialize Step

Step | Action
1 | To use the search box and find the next step to add to your transformation, from the Design tab, enter the following text into the search box: "seria"
    [Screenshot: Design tab search results showing De-serialize from file (Input) and Serialize to file (Output).]
2 | To add the step, from the Output category of the Design tab, drag the Serialize to file step onto the Canvas.
3 | To open the step's properties dialog, double-click the step.
4 | To set the step's properties, set them according to the table shown below:
    Property Name | Value
    Filename | ${DIR_OUTPUT}\CustomerData_Serialized
5 | Create a new hop according to the table below:
    Source Step | Destination Step
    Read multiple text files | Serialize to file
6 | Save the transformation.

Execute the Transformation

Step | Action
1 | Run the transformation.
2 | There should be no errors and the Step Metrics tab should look similar to the screenshot shown below:
    [Screenshot: Step Metrics tab.]
3 | To verify the serialized file was created, navigate to the folder shown below and verify the creation of the file as shown in the example screenshot:
    C:\pentahotraining\DataFiles\Output
    [Screenshot: folder listing showing CustomerData_Serialized.]

Solution Details
The solution to this exercise can be obtained using the details below:
Location: C:\pentahotraining\Solutions\Exercises
Completed transformation: EX3_SerializeMultipleTextFiles.ktr (Use File | Import from an XML file to import)

End of Exercise
Congratulations! You have completed this exercise.

Exercise 3 Advanced - Serializing Multiple Text Files

Introduction
This is an advanced version of Exercise 3. It is the same exercise, but without the detailed guidance. You may choose to do this advanced exercise rather than the standard version of Exercise 3.

Instructions
Create a transformation that has the following functionality:
* Read all files located in your parameterized output folder that match the following filename pattern: begin with 'Country_' and have the 'txt' extension.
* Use a regular expression for reading the correct file names.
* Serialize all the files and output to your parameterized output folder.
  The output filename is: CustomerData_Serialized

Steps Used
Use only the steps shown below to complete this exercise:
* Text file input
* Serialize to file

Finished Transformation
The completed transformation should resemble the screenshot shown below:
[Screenshot: completed transformation.]

Prerequisites
You must have Pentaho Data Integration (or Pentaho Business Analytics suite) installed and properly configured. You must also have access to course files required (if any).

End of Exercise
Congratulations! You have completed this exercise.

Exercise 4 - De-serializing a File

Introduction
In this exercise, you will create a transformation that reads the serialized file you created in the previous exercise and de-serializes it. Then, the data is sent to a Text file output step.

Note
For those looking for more of a challenge, try the advanced version of this exercise. It is the same exercise, but without the detailed guidance. You will find it in this workbook immediately following this exercise.

Objectives
After completing this exercise, you will be able to:
* De-serialize a file.
* Create a text file output with minimal width delimited data and customized fields.

Prerequisites
The serialized file created from executing the previous exercise.
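The second objective above, writing only selected fields as delimited text without padding, can be pictured with a short Python sketch. This is an illustration, not how PDI implements the Text file output step. The function name to_delimited_text and the extra quantityordered field are hypothetical, and the semicolon separator is an assumption based on the example file contents shown later in this exercise.

```python
import csv
import io

# The three fields the exercise keeps in the output file.
KEEP = ["ordernumber", "status", "country"]


def to_delimited_text(rows, fields=KEEP, separator=";"):
    """Write only the chosen fields, trimmed of padding, as delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=separator, lineterminator="\n")
    writer.writerow(fields)  # header line
    for row in rows:
        # str(...).strip() mimics "minimal width": no leading or trailing spaces.
        writer.writerow([str(row[name]).strip() for name in fields])
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        # quantityordered is a made-up extra field that should be dropped.
        {"ordernumber": 10120, "status": "Shipped   ", "country": "Australia",
         "quantityordered": 29},
    ]
    print(to_delimited_text(sample), end="")
```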
Create the Transformation

Step | Action
1 | To create the new transformation, in the Spoon menubar, click File | New | Transformation.
2 | To open the 'Transformation properties' dialog, in the Canvas, double-click on an empty area.
3 | Set the transformation properties for the Transformation tab according to the table below:
    Property Name | Value
    Transformation name | DeserializeFile
    Description | Add a description of your choice.
    Directory | /public/PDI_Trn_Objects
    NOTE: This is in the repository.
4 | To close the 'Transformation properties' dialog, click [OK].
5 | To save the transformation:
    * Press CTRL-S.
    * At the 'Transformation properties' dialog, click [OK].
    * At the 'Enter a comment' dialog, enter an optional comment, and then click [OK].

Create & Configure the De-serialize Step

Step | Action
1 | To create the first step, from the Input category of the Design tab, drag the De-serialize from file step onto the Canvas.
2 | To open the step's properties dialog, double-click the step.
3 | To set the step's properties, set them according to the table shown below:
    Property Name | Value
    Step Name | De-serialize Country Order Data
    Filename | C:\pentahotraining\DataFiles\Output\CustomerData_Serialized
4 | To close the step's properties dialog, click [OK].
5 | Save the transformation.
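Conceptually, the De-serialize from file step reverses what Serialize to file did in Exercise 3: rows go into a binary file and come back out unchanged. The Python sketch below is only an analogy built on the standard pickle module. PDI writes its own binary format, not pickle, and the function names and the second and third sample rows are made up for illustration.

```python
import os
import pickle
import tempfile


def serialize_rows(rows, path):
    """Write the rows to a binary file (stand-in for Serialize to file)."""
    with open(path, "wb") as handle:
        pickle.dump(rows, handle)


def deserialize_rows(path):
    """Read the rows back from the binary file (stand-in for De-serialize from file)."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    rows = [
        {"ordernumber": 10120, "status": "Shipped", "country": "Australia"},
        {"ordernumber": 10121, "status": "Shipped", "country": "France"},   # made-up row
        {"ordernumber": 10122, "status": "Shipped", "country": "Germany"},  # made-up row
    ]
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "CustomerData_Serialized")
        serialize_rows(rows, path)
        restored = deserialize_rows(path)
    # The round trip should give back exactly what was written.
    print(restored == rows)
```

As with the PDI steps, the binary file is not meant to be read by people; it only becomes readable again after it has been de-serialized and written out as text.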
Create & Configure the Text file output Step

Step | Action
1 | To create the Text file output step, from the Output category of the Design tab, drag the Text file output step onto the Canvas.
2 | Create a hop between the De-serialize and Text file output steps.
3 | Set the Text file output step's properties according to the table shown below:
    Property Name | Value
    Step Name | Combined Country Data to Text
    Filename | C:\pentahotraining\DataFiles\Output\CountryData_Deserialized
4 | To verify the correct filename will be created when this transformation is run, click [Show filename(s)...].
    Output file(s): C:\pentahotraining\DataFiles\Output\CountryData_Deserialized.txt
    Click [Close].
5 | To add all of the fields contained in the input file to this step:
    * Click on the Fields tab.
    * Click [Get Fields].
6 | The text file you want to output will only contain a subset of these fields, because not all are needed. To configure the Fields tab for only the fields needed, delete all but the following fields:
    * ordernumber
    * status
    * country
7 | To configure the text file output to have no trailing spaces in each column's data, click [Minimal width].
8 | To close the 'Text file output' dialog, click [OK].
9 | Save the transformation.

Preview the De-serialized Data in the Stream
The transformation is nearly complete, but you may wish to preview data that is coming from the De-serialize step before actually writing a text output file.

Step | Action
10 | To preview the data coming from the De-serialize step:
    * Select the "Text file output" step.
    * Click the Preview button in the Canvas toolbar.
    * Click [Quick Launch] and preview the data.
    * Click [Close] to close the 'Examine preview data' dialog.
    * Click [Close] to close the 'Select the preview step' dialog.

Execute the Transformation

Step | Action
1 | Run the transformation.
2 | There should be no errors and the 'Step Metrics' tab should look similar to the screenshot shown below:
    [Screenshot: Step Metrics tab.]
    TIP: The value in the 'Output' column shows that data was in fact output to a file or table. In this case, 273 lines were written to a text file.
3 | To verify the text file was created, navigate to the folder shown below and verify the creation of the file as shown in the example screenshot:
    C:\pentahotraining\DataFiles\Output
    [Screenshot: folder listing showing CountryData_Deserialized.txt.]
4 | To verify the text file contains the de-serialized (plain text) data, double-click the file.
    Example of the text file contents:
        ordernumber;status;country
        10120;Shipped;Australia
        10120;Shipped;Australia
        10120;Shipped;Australia
        10120;Shipped;Australia
        10120;Shipped;Australia

Solution Details
The solution to this exercise can be obtained using the details below:
Location: C:\pentahotraining\solutions\DeserializeFile\
Completed transformation: EX4_DeserializeFile.ktr (Use File | Import from an XML file to import)

End of Exercise
Congratulations! You have completed this exercise.

Exercise 4 Advanced - De-Serializing Multiple Text Files

Introduction
This is an advanced version of Exercise 4. It is the same exercise, but without the detailed guidance. You may choose to do this advanced exercise rather than the standard version of Exercise 4.

Instructions
Create a transformation that has the following functionality:
* De-serialize the file output in Exercise 3, 'CustomerData_Serialized'.
* Use a Dummy (do nothing) step to preview the file and verify it was properly de-serialized.
* Output only the fields shown below to a text file in your output directory:
  o ordernumber
  o status
  o country

Steps Used
Use only the steps shown below to complete this exercise:
* De-serialize from file
* Dummy (do nothing)
* Text file output

Note
The "Dummy (do nothing)" step is optional. It is not necessary for the transformation to work correctly.

Finished Transformation
The completed transformation should resemble the screenshot shown below:
[Screenshot: completed transformation.]

Prerequisites
The serialized file created from executing the previous exercise.

End of Exercise
Congratulations! You have completed this exercise.

Guided Demo 7 - Connections & the Database Explorer

Introduction
In this guided demonstration, you will prepare your environment to use a MySQL database by copying the MySQL driver to the proper location and starting the MySQL database. Then you will create a database connection and use the Database Explorer.

Objectives
In this guided demonstration, you will:
* Understand the placement of database drivers and how to start the MySQL database server.
* Create a database connection.
* Use Database Explorer to interact with a data source.

Prerequisites
You must complete the prior exercise on installing and configuring Pentaho Data Integration in order to complete this exercise. You must also have access to course files required (if any).

Part I: Adding the MySQL Driver to the PDI Environment
You will be using a MySQL database. Pentaho Data Integration does not ship with the MySQL database driver.
In order for your environment to function correctly, this driver needs to be added to the PDI directory structure. Therefore, before this guided demo can really begin, you need to make sure the MySQL driver is in the proper location, and that the database server is started.

Step | Action
1 | Save any of your current work and close Spoon.
2 | Locate the mysql-connector-java-5.1.17-bin.jar file in the following folder:
    C:\pentahotraining\Software\MySQL Driver
3 | Copy the mysql-connector-java-5.1.17-bin.jar file into the following two folders:
    C:\Pentaho\design-tools\data-integration\lib
    C:\Pentaho\server\data-integration-server\tomcat\lib
4 | Run the Start MySQL shortcut located in the following folder:
    C:\pentahotraining\Software\MySQL Driver
5 | A console will open and the MySQL database server will start. If you are presented with a firewall screen, click [Allow].
6 | CAUTION: Do not close this window, otherwise MySQL will stop and you cannot connect to the MySQL database.

Part II: Database Connections
In part II of this exercise, you create a database connection in PDI.
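A PDI database connection boils down to a handful of fields (connection type, host, port, database name, credentials) from which a JDBC URL is built; the Feature List in the connection dialog shows the resulting Driver class and URL. As a rough sketch, the Python below assembles URLs in the commonly documented HSQLDB server and MySQL forms. The names URL_TEMPLATES and jdbc_url are hypothetical, the sample host, port, and database values are the ones used for the connections later in this guide, and the URL PDI actually displays may carry additional options.

```python
# URL shapes commonly documented for these two JDBC drivers (assumed here,
# not read from PDI itself).
URL_TEMPLATES = {
    "hypersonic": "jdbc:hsqldb:hsql://{host}:{port}/{database}",
    "mysql": "jdbc:mysql://{host}:{port}/{database}",
}


def jdbc_url(connection_type, host, port, database):
    """Build a JDBC URL from the fields entered in a connection dialog."""
    key = connection_type.lower()
    if key not in URL_TEMPLATES:
        raise ValueError(f"no URL template for connection type: {connection_type}")
    return URL_TEMPLATES[key].format(host=host, port=port, database=database)


if __name__ == "__main__":
    print(jdbc_url("Hypersonic", "localhost", 9001, "sampledata"))
    print(jdbc_url("MySQL", "localhost", 3999, "pentaho_oltp"))
```

Comparing a URL built this way with the one in the Feature List is a quick way to spot a mistyped host name, port, or database name.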
Creating a Database Connection
To create a database connection in PDI:

Step | Action
1 | Open an existing transformation.
2 | Save the transformation as ConnectExplore.
3 | In Pentaho Data Integration, switch from the Design tab to the View tab.
4 | Right-click Database connections and select New. Fill in the dialog box as follows.
5 | In the 'Database Connection' dialog, type or select the following:
    Field | Value
    Connection Name | sampledata
    Connection Type | Hypersonic
    Access | Native (JDBC)
    Host Name | localhost
    Database Name | sampledata
    Port Number | 9001
    User Name | pentaho_admin
    Password | password
6 | [Screenshots: completed 'Database Connection' dialog and connection test result.]
8 | Click [OK] to close the "Database Connection Test" dialog.
9 | In the 'Database Connection' dialog, click Feature List.
10 | Notice the connection properties.
    In particular, notice the Driver class and URL values.
    [Screenshot: Feature List.]
11 | Click the Parameter column header to sort the display by parameter.
12 | Click [OK] to close the "Database Connection" dialog.
13 | On the View tab, expand Database Connections, right-click sampledata.
14 | Save the transformation.

Part III: Database Explorer
In part III of this guided demo, you will use the Database Explorer tool to interact with a configured data source.

Using Database Explorer
To use Database Explorer:

Step | Action
1 | If necessary, on the View tab, expand Database Connections.
2 | Right-click sampledata and choose Explore from the menu options.
    [Screenshot: context menu for the sampledata connection.]
3 | In the Database Explorer window, expand pentaho_olt
4 | Right-click the customers table and choose Preview first 100 from the menu options.
5 | Examine the customer data.
6 | [...] appears. Type this SQL:
    SELECT * F[...] ORDER BY city
    [Screenshot: Simple SQL Editor.]
7 | Click [Execute].
9 | Right-click the customers table and choose Truncate Table.
10 | Note that the Simple SQL Editor appears with the Truncate statement commented out. Do not run the truncate; just close the editor.
11 | Close the Simple SQL Editor (upper corner).
12 | Right-click the customers table and choose DDL. Then choose the current connection.
13 | A dialog box appears with a create statement. This is handy if you want to create a duplicate table in another database.
14 | Click [OK] to close Database Explorer. (You may need to expand the size of the dialog to view the [OK] button.)

End of Guided Demonstration
Congratulations! You have completed this guided demonstration.

Exercise 5 - Reading & Writing to Database Tables

Introduction
This exercise is designed to introduce you to various methods of interacting with databases using Pentaho Data Integration. First, you create a database connection, explore a data source, and create transformations that use various data input and output steps including: Table input, Table output, Text file output, and CSV file input.

Note
For those looking for more of a challenge, try the advanced version of this exercise. It is the same exercise, but without the detailed guidance. You will find it in this workbook immediately following this exercise.

Objectives
After completing this exercise, you will be able to:
* Create two (2) database connections, pentaho_olap & pentaho_oltp.
* Use Database Explorer to interact with a data source.
* Create a transformation that uses the Table input and Table output steps.
* Create a transformation that uses the Text file output step.
Prerequisites
You must complete the prior exercise on installing and configuring Pentaho Data Integration in order to complete this exercise. You must also have access to course files required (if any).

Part I: Database Connections
In part I of this exercise, you create a database connection in PDI.

Creating Two Database Connections
To create the database connections in PDI:

Step | Action
1 | Start Spoon.
2 | Choose File | New | Transformation to create a new transformation. (You can also click the New File icon and choose Transformation.)
3 | Save the transformation in the repository using File name: 'TableInputOutput'.
4 | In Pentaho Data Integration, switch from the Design tab to the View tab.
5 | Expand Transformations > TableInputOutput, right-click Database connections and choose New.
6 | In the 'Database Connection' dialog, type or select the following:
    Field | Value
    Connection Name | pentaho_oltp
    Connection Type | MySQL
    Access | Native (JDBC)
    Host Name | localhost
    Database Name | pentaho_oltp
    Port Number | 3999
    User Name | pentaho
    Password | pentaho
    [Screenshot: completed 'Database Connection' dialog.]
7 | Click Test. A pop-up dialog shows the test result:
    [Screenshot: 'Connection to database [pentaho_oltp] is OK', with Hostname localhost and Port 3999.]
8 | Click [OK] to close the "Database Connection Test" dialog and click [OK] to close the "Database Connection" dialog.
9 | On the View tab, expand Database Connections, right-click pentaho_oltp and choose Share from the context menu. When a connection is shared, it appears in bold text.
10 | Repeat the previous steps to create a connection to the pentaho_olap database using the following values:
    Field | Value
    Connection Name | pentaho_olap
    Connection Type | MySQL
    Access | Native (JDBC)
    Host Name | localhost
    Database Name | pentaho_olap
    Port Number | 3999
    User Name | pentaho
    Password | pentaho
11 | [Screenshot: 'Database Connection' dialog for pentaho_olap.]
12 | Click the Advanced tab on the left side of the dialog, and then clear the first two check boxes.
    [Screenshot: Advanced tab check boxes.]
13 | Click Test to verify the connection.
14 | Click [OK] to close the "Database Connection Test" dialog.
15 | Double-click the pentaho_olap connection to open it.
16 | In the 'Database Connection' dialog, click Feature List.
17 | Notice the connection properties. In particular, notice the Driver class and URL values.
18 | Click [OK] to close the "Database Connection" dialog.
19 | On the View tab, expand Database Connections, right-click pentaho_olap and choose Share from the context menu. Notice that when the connection is shared, it appears in bold text.
20 | Save the transformation.

Part II: Database Explorer
In part II of this exercise, you use the Database Explorer tool to interact with a configured data source.

Using Database Explorer
To use Database Explorer:

Step | Action
1 | If necessary, on the View tab, expand Database Connections.
2 | Right-click pentaho_oltp and choose Explore from the menu options.
3 | In the Database Explorer window, expand pentaho_oltp > Tables.
4 | Right-click the customers table and choose Preview first 100 from the menu options.
5 | Examine the customer data and click Close when you are finished.
6 | As time permits, explore the data in the other tables in pentaho_oltp.
7 | Click [OK] to close Database Explorer. (You may need to expand the size of the dialog to view the [OK] button.)

Part III: Using Table Input & Output
In part III of this exercise, you create a transformation that uses the Table input and Table output steps. The transformation is used to transfer data between databases, a common procedure in data warehousing.

Using Table Input & Output Steps
To create a transformation using Table input and output steps:

Step | Action
1 | Switch to the Design tab.
2 | Drag an Input > Table input step onto the canvas.
3 | Drag an Output > Table output step onto the canvas.
4 | Create a hop between Table input and Table output.
5 | Double-click Table input.
6 | In the 'Table input' dialog, for Connection, choose pentaho_oltp from the drop-down list.
7 | Click the Get SQL select statement button.
8 | In the 'Database Explorer' dialog, expand pentaho_oltp > Tables, select orderdetails and click [OK].
