
Learn Excel from Mr Excel - Week 17

Learn Excel from Mr Excel


Copyright 2005 Bill Jelen
All Rights Reserved
Encourage your friends to sign up at
http://www.mrexcel.com/learn-excel.html
This week - four topics that deal with imperfect data.
The first two topics introduce concepts that come in
handy when solving the problem in topic 3.

Part 3: WRANGLING DATA


Additional Details: You might want to assign a numeric format of 0000
to the result, as shown in Fig. 809.
Summary: Consolidation is one method for identifying who is in which list when you have many
lists of data.
Cross Reference: Pivot tables also do a great job
at this task.
Commands Discussed: Data > Consolidate

Fig. 809

Number Each Record for a Customer, Starting at One for a New Customer
Problem: You have a list of invoice data, as shown in Fig. 810. You wish
to number the records in such a way that the first invoice number for
Ford is one. The next Ford invoice is two, and so on. When you get to a
new customer, you want to start over at one.

Fig. 810


Strategy: Sort the data by Customer. Insert a new temporary column A
and add a heading called Rec #, as shown in Fig. 811.

Fig. 811

The formula in A2 is =IF(C2=C1,1+A1,1). In plain language, this formula
says, "If the customer in C is equal to the customer above me, then
add 1 to the cell above me. Otherwise, start at 1."
Result: Each group of customer invoices is numbered from 1 to N, as
shown in Fig. 812.
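
The same compare-with-the-row-above logic can be sketched outside Excel. The Python below is only an illustration, not part of the original topic: the function name and the customer names other than Ford are made up, and the list is assumed to be sorted by customer already.

```python
def number_within_customer(customers):
    """Mirror =IF(C2=C1,1+A1,1) on a list that is already sorted by customer."""
    numbers = []
    previous = None
    for customer in customers:
        if numbers and customer == previous:
            # Same customer as the row above: add 1 to the number above.
            numbers.append(numbers[-1] + 1)
        else:
            # First row, or a new customer: start over at 1.
            numbers.append(1)
        previous = customer
    return numbers


# Hypothetical sample data, sorted by customer.
customers = ["Ford", "Ford", "Ford", "GM", "GM", "Shell"]
rec_numbers = number_within_customer(customers)
print(rec_numbers)  # [1, 2, 3, 1, 2, 1]
```

Like the worksheet formula, this only works when the rows for each customer are adjacent.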

Fig. 812


As shown in Fig. 813, copy the formulas in column A and Paste
Special Values to change to numbers so that you can re-sort the data
by Invoice Number.

Fig. 813

Alternate Strategy: A formula of =COUNTIF($C$2:C2,C2) will work
without sorting.
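
The expanding range $C$2:C2 counts how many times the current customer has appeared so far, which is why no sort is needed. A rough Python equivalent, with hypothetical names and sample data, keeps a running tally per customer:

```python
from collections import defaultdict


def running_count(customers):
    """Mirror =COUNTIF($C$2:C2,C2): count occurrences of each value so far."""
    seen = defaultdict(int)
    result = []
    for customer in customers:
        seen[customer] += 1
        result.append(seen[customer])
    return result


# Hypothetical sample data, deliberately NOT sorted by customer.
unsorted_customers = ["Ford", "GM", "Ford", "Shell", "GM", "Ford"]
counts = running_count(unsorted_customers)
print(counts)  # [1, 1, 2, 1, 2, 3]
```

One difference to keep in mind: COUNTIF ignores case when matching text, while the dictionary lookup in this sketch is case-sensitive.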
Summary: The IF function is perfect for this task of comparing the current record to the record above.
Functions Discussed: =IF()

Add a Group Number to Each Set of Records with a Unique Customer Number
Problem: You have a list of invoice data. You wish to number the records in such a way that the invoices for the first customer all have a
group number of 1. The invoices for the next customer all will have a
group number of 2.
Strategy: Sort the data by customer. Add a new column A, with the
heading of Group. In cell A2, enter the number 1 for Group #1. In cell
A3, enter the formula that will be used for the rest of the records:

=IF(C3=C2,A2,1+A2)
In plain language, this formula says, "If the customer on this row equals
the customer on the row above, then use the group number on the row
above. Otherwise, add one to the group number above." Copy this formula
down to all the other rows, as shown in Fig. 814.

Fig. 814

Result: The records are all assigned a group number. Each customer
has a unique group number.
In order to allow future sorting, copy the formulas in column A and use
Paste Special Values to convert the formulas to numbers.
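
For readers who want to see the group-number logic spelled out, here is a minimal Python sketch. It assumes the list is sorted by customer; the function name and sample customers are hypothetical.

```python
def group_numbers(customers):
    """Mirror A2 = 1 and A3 = IF(C3=C2,A2,1+A2) on a sorted list."""
    groups = []
    for i, customer in enumerate(customers):
        if i == 0:
            groups.append(1)                # first data row is Group 1
        elif customer == customers[i - 1]:
            groups.append(groups[-1])       # same customer: reuse the group above
        else:
            groups.append(groups[-1] + 1)   # new customer: next group number
    return groups


# Hypothetical sample data, sorted by customer.
sorted_customers = ["Ford", "Ford", "GM", "GM", "GM", "Shell"]
groups = group_numbers(sorted_customers)
print(groups)  # [1, 1, 2, 2, 2, 3]
```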
Summary: Use the IF function to add a group number to each group of
records.
Functions Discussed: =IF()

Deal with Data Where Each Record Takes Five Physical Rows
Problem: Sometime back, in the days of COBOL, a programmer was
dealing with the constraints of the physical width of a page. The programmer built a report where each record actually took up five lines of
the report, as shown in Fig. 815. You now want to analyze this data in
Excel.


Fig. 815

Strategy: Your goal is to get the data back into one row per record. This
process is possible. The process involves adding two new columns, one
called Group number and one called Sequence.


1) First, add a new Row 1. Insert two new columns A and B. Add
headings as shown in Fig. 816 in A1:C1. The headings should be
Group, Sequence, and Text.

Fig. 816

2) In column A, assign a Group number to each logical record.
One way to do this is to check to see if the first four characters of
column C are ACCT. If this is true, add one to the group number.
In A2, enter the number 1. In A3, enter the formula
=IF(LEFT(C3,4)="ACCT",1+A2,A2). Copy it down to all of the rows. This
will neatly assign each logical group of records a group number, as
shown in Fig. 817.

Fig. 817


3) Next, design a formula for a sequence number.
In cell B2, enter the formula =IF(A2=A1,B1+1,1). Copy this down.
This formula will number each record in the group, as shown in
Fig. 818. It should ensure that all of the Account numbers are on a
Sequence 1 record.

Fig. 818

4) This step is critical. Copy the formulas in columns A and B and
paste them back, using Paste Special Values. This will ensure that
you can safely sort the data.


5) Sort the data by the Sequence number in column B. Your data will
look like Fig. 819.

Fig. 819

You have now managed to intelligently segregate the data so that
all similar records are together. A contiguous range of C2:C7 contains
all of the first rows from each record. All of the line 1 records
have three fields that really should be parsed into three separate
columns. You can easily do this with the Text to Columns Wizard.

6) Select cells C2:C7. From the menu, select Data > Text to Columns.
Select Fixed Width and Next, as shown in Fig. 820.


Fig. 820

7) Excel should properly guess where your columns are, as shown in
Fig. 821. Click Next.

Fig. 821


8) Choose the heading for each column and define a data format. You
don't really need the word ACCT each time, so choose to Skip the
first, third, and fifth fields. Make the sixth field a date. When your
information looks like Fig. 822, choose Finish.

Fig. 822

9) The Sequence 1 rows will now have data in three columns. As shown
in Fig. 823, change the heading in C1 to be Acct, the heading in D1 to
be Inv, and the heading in E1 to be Date.

Fig. 823


10) Next, select A8:C13. Cut and paste in F2. Add headings in F1:H1 of
Group, Seq, and Total, as shown in Fig. 824.

Fig. 824

11) Select H2:H7 and choose Data > Text to Columns. In Step 1, select
Fixed Width and choose Next. In Step 2, Excel offers to split your
data into three fields. There is no need to have one column for the
word "Invoice" and another column for the word "Total". As shown in
Fig. 825, double-click the line between Invoice and Total to delete
the line.

Fig. 825


12) After you double-click the first line, it is deleted. Choose Next,
as shown in Fig. 826.

Fig. 826

13) In Step 3, choose to skip the field with Invoice Total and choose
Finish, as shown in Fig. 827.

Fig. 827


14) Select the Sequence 3 records, as shown in Fig. 828.

Fig. 828

15) Copy them to I2. The headings in I1:K1 are Group, Seq, and Name,
as shown in Fig. 829.

Fig. 829

16) Select the Sequence 4 records. Cut and paste in L2.
17) Select the Sequence 5 records. Cut and paste in O2.
18) As shown in Fig. 830, the Sequence 6 records have no data; they are
just dashed lines. You can delete these rows.

Fig. 830


You now have all the fields, one line per record. You also have the
Group and Sequence columns repeated five times each. Before you delete
the Group and Sequence columns, let's make sure that everything
worked correctly. The Group numbers in columns A, F, I, L, and O should
all match.
19) As shown in Fig. 831, in a blank column at the end, enter a large
AND function as shown below. Copy this formula down to all
rows:

=AND(A2=F2,F2=I2,I2=L2,L2=O2)

Fig. 831

20) A value of True means that all five pieces on that row carry the
same Group number, so they belong to the same original record. To test
if all of the rows have True, enter the formula =AND(R2:R99) in cell R1.
As shown in Fig. 832, if this formula is True, then you've cross-checked
that all of the rows match up.

Fig. 832


21) At this point, you can delete the columns that you don't need. As
shown in Fig. 833, delete columns R, P, O, M, L, J, I, G, F, B, and A.

Fig. 833

Result: You now have a sortable, filterable, and reportable version of
the original dataset. Each record consists of one row in Excel, as shown
in Fig. 834.

Fig. 834
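
The Group and Sequence idea from the steps above can also be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's method: the report lines are invented (the original figures are not reproduced here), every logical record is assumed to start with a line beginning with ACCT, and the function names are hypothetical.

```python
def tag_lines(lines, marker="ACCT"):
    """Return (group, sequence, text) triples, like columns A:C after step 3."""
    tagged = []
    group = 0
    sequence = 0
    for text in lines:
        if text[:4] == marker:
            group += 1        # a new logical record starts here
            sequence = 1
        else:
            sequence += 1     # next physical line of the same record
        tagged.append((group, sequence, text))
    return tagged


def one_row_per_record(tagged):
    """Collect the lines of each group into one row, ordered by sequence."""
    records = {}
    for group, sequence, text in tagged:
        records.setdefault(group, {})[sequence] = text
    return [
        [fields[seq] for seq in sorted(fields)]
        for _, fields in sorted(records.items())
    ]


# Invented sample report: two records, each ending in a dashed separator line.
report_lines = [
    "ACCT 10001 INV 5001 DATE 01/15/2005",
    "INVOICE TOTAL 250.00",
    "ABC COMPANY",
    "-----",
    "ACCT 10002 INV 5002 DATE 01/16/2005",
    "INVOICE TOTAL 99.50",
    "XYZ INC",
    "-----",
]

tagged = tag_lines(report_lines)
# Drop lines that contain nothing but dashes, as in step 18.
kept = [row for row in tagged if row[2].strip("- ")]
records = one_row_per_record(kept)
for record in records:
    print(record)
```

Grouping by the Group number plays the role of the cut-and-paste steps, and it keeps the pieces of each record together by construction, so the AND cross-check is not needed in this form.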

Summary: This process is convoluted. However, if you are presented
with data as shown in the original example, the only way to quickly add
up figures or to produce a report is to follow steps similar to the ones
shown in this topic.
Commands Discussed: Data > Text to Columns
Functions Discussed: =IF(); =AND(); =LEFT()
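
As a rough picture of what the Fixed Width option of Text to Columns does, the sketch below slices one line at fixed break positions, skips the label fields, and converts the last field to a date. The sample line, the break positions, and the date format are all assumptions made for illustration; the real layout is whatever Fig. 815 shows.

```python
from datetime import datetime

# Assumed column breaks for a line like "ACCT 10001 INV 5001 DATE 01/15/2005".
BREAKS = [0, 5, 11, 15, 20, 25, 35]
# Keep the second, fourth, and sixth fields; skip the first, third, and fifth.
KEEP = [1, 3, 5]


def split_fixed_width(line, breaks=BREAKS):
    """Cut a line at fixed positions and strip the padding from each piece."""
    return [line[a:b].strip() for a, b in zip(breaks, breaks[1:])]


def parse_line1(line):
    """Return (account, invoice, date) from a Sequence 1 line."""
    fields = split_fixed_width(line)
    acct, inv, date_text = (fields[i] for i in KEEP)
    return acct, inv, datetime.strptime(date_text, "%m/%d/%Y").date()


sample_line = "ACCT 10001 INV 5001 DATE 01/15/2005"
parsed = parse_line1(sample_line)
print(split_fixed_width(sample_line))
print(parsed)
```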


Add a Customer Number to Each Detail Record
Problem: You've imported a dataset where they list the customer number once in column A and then have any number of invoice detail records.
At the end of the first customer, they put the next customer number in
column A and then have the detail records for that customer, as shown
in Fig. 835. This is a common data format, but it is horrible in Excel.
You cannot sort this dataset. You need to add the customer information
to each record.

Fig. 835

Strategy:
1) Insert new columns A and B. Add headings of Acct and Customer,
as shown in Fig. 836.

Fig. 836


Here is the basic logic of what you want Excel to do, in plain language:
Look at the first four characters of column C. If they are equal to
"Acct", then this row has customer information. Take data from
that cell and move it to column A. Otherwise, if the first four
characters are anything other than "Acct", use the same account
information from the previous row's column A.

The formula to do this for cell A2 is as follows:

=IF(LEFT(C2,4)="Acct",MID(C2,6,5),A1)

2) Enter this formula into cell A2 and copy it down through column A.

As shown in Fig. 837, as you copy this formula down, it does the job.
In cell A2, the IF condition is true and data is extracted from C2.
In cell A3, the condition is not true, so the value from A2 is used.
Down in cell A7, a new customer number is found so the data from
C7 is used in A7. Cells A8 through A59 get the customer number
from A7.

Fig. 837

Similar logic is needed in column B. In this case, though, you need
to grab the customer name. You know that the word "Acct" and the
space that follows it take up five characters. You know that your
account number is another five characters and then there is a space
before the customer name. Thus, you want to ignore the first 11
characters of cell C2.

You can use a formula of =MID(C2,12,50) to skip the first 11 characters
and return up to the next 50 characters, which hold the customer name.
However, if the text in column C is padded with spaces, the result will
carry those trailing spaces along; MID itself never pads its result.
You really don't want all of those trailing spaces. The =TRIM() function
will remove leading and trailing spaces from text, and it also reduces
runs of spaces between words to a single space.

Use =TRIM(MID(C2,12,50)) as the formula to extract a customer
name. Use this formula as the True portion of the IF function. As
shown in Fig. 838, the formula in B2, copied down through column
B, is as follows:

=IF(LEFT(C2,4)="Acct",TRIM(MID(C2,12,50)),B1)

3) Enter this formula into cell B2 and copy it down through column B.

Fig. 838


You have now successfully filled in account and customer. You need
to change these formulas to values.

4) Highlight columns A and B. Hit Ctrl+C to copy. Choose Edit > Paste
Special > Values to convert the formulas to values.

The last task is to remove all of the customer heading rows. As you
look for a method to isolate the heading rows, you will notice that
heading rows are the only rows with blank cells in column D. You
can move the blanks to the end of a dataset by sorting the data by
column D.

5) Select the heading in D1. Hit the AZ sort button to sort ascending
by Date. Any rows without a value in column D will automatically
sort to the bottom of the dataset, as shown in Fig. 839.

6) From D1, press the End key and then the Down Arrow key twice.
The cell pointer will be located on the first customer heading. Delete
all of the rows below row 564.

Fig. 839


Result: You have a clean dataset with customer information on every
row, as shown in Fig. 840. You can sort this data and otherwise use it
for data analysis.

Fig. 840
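
The fill-down logic of this topic can be sketched in Python as well. The sample lines below are invented, since the original figures are not reproduced here, and the function name is hypothetical. The sketch assumes heading lines look like "Acct", a space, a five-character account number, a space, and then the customer name, as described in the strategy. Instead of sorting the heading rows to the bottom and deleting them, it simply leaves them out of the result.

```python
def fill_down_customer(lines, marker="Acct"):
    """Attach the most recent account and customer name to each detail line."""
    rows = []
    acct = None
    name = None
    for text in lines:
        if text[:4] == marker:
            acct = text[5:10]             # same characters as MID(C2,6,5)
            # Same characters as MID(C2,12,50); strip() only removes outer
            # whitespace, while Excel's TRIM also collapses inner spaces.
            name = text[11:61].strip()
        else:
            rows.append((acct, name, text))
    return rows


# Invented sample data: a heading line followed by detail lines.
sample_lines = [
    "Acct 10001 ABC Company   ",
    "INV 5001 250.00",
    "INV 5002 75.25",
    "Acct 10002 XYZ Inc",
    "INV 5003 99.50",
]

detail_rows = fill_down_customer(sample_lines)
for row in detail_rows:
    print(row)
```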

Summary: A couple of formulas with IF functions help to snap this data
into shape.
Functions Discussed: =IF(); =LEFT(); =MID(); =TRIM()
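
To make the character arithmetic behind these functions concrete, here are small Python stand-ins for LEFT, MID, and TRIM. They are approximations written for this illustration (the helper names and the sample heading are made up), and they cover only the text behavior used in this topic. Note that Excel counts characters from 1, while Python slices count from 0.

```python
def left(text, num_chars):
    """Rough stand-in for LEFT(text, num_chars)."""
    return text[:num_chars]


def mid(text, start_num, num_chars):
    """Rough stand-in for MID(text, start_num, num_chars); start_num is 1-based."""
    begin = start_num - 1
    return text[begin:begin + num_chars]


def trim(text):
    """Rough stand-in for TRIM: drop outer spaces, collapse inner runs to one."""
    return " ".join(part for part in text.split(" ") if part)


# Invented heading line, padded with spaces on the right.
heading = "Acct 10001 ABC   Company      "

print(left(heading, 4))           # Acct
print(mid(heading, 6, 5))         # 10001
raw_name = mid(heading, 12, 50)   # everything after the first 11 characters
print(trim(raw_name))             # ABC Company
```

In this sketch, the slice returns fewer than 50 characters when the text runs out, so any trailing spaces in `raw_name` come from the source text itself.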

Use a Pivot Table to Summarize Detailed Data
Problem: You have 50,000 rows of sales data, as shown in Fig. 841.
You want to produce a summary report showing sales by region and
product.

Fig. 841
