Insights into cM Patterns

Featured

I now have over 8,700 Matches at AncestryDNA with a confirmed Common Ancestor (CA) with me between 2C and 8C. See my Common Ancestor Spreadsheet post here. That’s a lot of data, so I thought I’d do some analysis. In 2024 I posted (here) my averages for 3C to 8C which roughly agreed with the Shared cM project.

Below is a table summarizing all of my data (including full cousins, half cousins and removed cousins). For each relationship there are columns for the number of Matches, the average cMs, the lowest cM, the highest cM; plus the number of generations (meiosis events), and average cMs for each. The table is then repeated with a sort based on meiosis events.

A word about meiosis events. They are the count from me up to the CA and then back down to the Match. Like generations… A 1C is 4 events (two up to the grandparent, who is the CA, plus two back down to the Match). A 1C2R is 6 events (two up and four down). A half relationship adds one to the meiosis events – e.g. a 4C1R is 11 events, and a 4C1Rh is 12 events. These are important because in a mathematical simulation, each event reduces the cM by half. From the Shared cM Project, the 1C (4 events) average is 866cM and the 2C1R (7 events) average is 122cM, which is roughly 866 halved three times (108cM). Remember, it’s an order of magnitude thing.

And, as we shall see, the halving generally works for close relationships (like 1C and 2C), but drifts away for more distant relationships (like 4C and beyond). Important: this is not biology’s fault, it’s the math’s fault. We have a LOT of true distant cousins who do NOT share matching DNA with us, and they are not reflected in the averages. This (lack of a normal curve) is highlighted in the second sort (by meiosis numbers) below.

This is also reflected in the DNA Painter Shared cM Project tool, which shows different groups of relationships for a given input cM value. For example, at DNA Painter, plug in 55cM… the 29% group of 3Ch, 3C1R, 2C3R and 2C2Rh are all 9 meiosis events; and the second group of 4C, 3C1Rh, and 3C2R are all 10 meiosis events. This also demonstrates that by the time we get down to 3C and 4C levels there is a lot of overlap.
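The counting rule above can be sketched in a few lines of Python. This is only an illustration: the label format (such as "4C1Rh") follows the tables in this post, the function names are made up for the example, and the halving uses the 866cM 1C average quoted above as its starting point.

```python
import re

def meiosis_events(label: str) -> int:
    """Count meiosis events for a label such as '1C', '2C1R' or '4C1Rh'."""
    m = re.fullmatch(r"(\d+)C(?:(\d+)R)?(h?)", label)
    if m is None:
        raise ValueError(f"unrecognized relationship label: {label!r}")
    degree = int(m.group(1))        # the N in NC
    removed = int(m.group(2) or 0)  # the M in MR, 0 if absent
    half = m.group(3) == "h"
    # (degree + 1) steps up to the CA and the same number back down,
    # plus one step per remove, plus one for a half relationship.
    return 2 * (degree + 1) + removed + (1 if half else 0)

def halved_cm(label: str, base_cm: float = 866.0, base_events: int = 4) -> float:
    """Halve the 1C average once for every meiosis event beyond a 1C."""
    return base_cm / 2 ** (meiosis_events(label) - base_events)

print(meiosis_events("4C1R"), meiosis_events("4C1Rh"))
print(halved_cm("2C1R"))
```

The halved value is the simulation figure only; as the tables show, observed averages stop following it at the more distant levels.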

For this first table, the takeaway is that the number of Matches with CAs increased dramatically with each generation. [Note: I combine each full cousin level with its 1R because, at my age, most Matches will be a generation younger than me.] 3C & 3C1R: 196 Matches; 4C & 4C1R: 662 Matches; 5C & 5C1R: 1,606 Matches; 6C & 6C1R: 3,425 Matches. WOW, what an increase in the number of Match cousins. And then we have 7C & 7C1R: 584 Matches; 8C & 8C1R: 363 Matches. What happened? Why the steep decrease in numbers? Well, IMO, the major factor is that AncestryDNA’s ThruLines quits at 6C – ThruLines can “see” into private Trees (I cannot); and it roots out MRCAs with the smallest of Trees (I don’t have that time). I can only dream of how many ThruLines I’d get at the 7C and 8C levels. Some of the ones I have now were found/recorded when we had Circles at Ancestry.

The point is: there are LOTS of cousins still waiting to be determined. ProTools is helping.

Table 1: 8,799 AncestryDNA Matches Summarized by Relationship

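| MRCA | # Matches | avg cM | low cM | high cM | meiosis events |
|---|---|---|---|---|---|
| 1C2Rh | 3 | 138 | 78 | 200 | 7 |
| 2C | 1 | 269 | | | 6 |
| 2C1R | 14 | 127 | 34 | 220 | 7 |
| 2C2R | 8 | 47 | 39 | 162 | 8 |
| 2C3R | 2 | 34 | 22 | 140 | 9 |
| 3C | 57 | 63 | 13 | 208 | 8 |
| 3Ch | 5 | 20 | 16 | 95 | 9 |
| 3C1R | 139 | 28 | 6 | 148 | 9 |
| 3C1Rh | 26 | 28 | 7 | 111 | 10 |
| 3C2R | 106 | 22 | 6 | 68 | 10 |
| 3C2Rh | 20 | 20 | 6 | 92 | 11 |
| 3C3R | 34 | 22 | 6 | 58 | 11 |
| 3C3Rh | 12 | 23 | 8 | 40 | 12 |
| 3C4R | 1 | 20 | | | 12 |
| 3C4Rh | 2 | 10 | 8 | 12 | 13 |
| 4C | 128 | 24 | 6 | 220 | 10 |
| 4Ch | 7 | 12 | 6 | 19 | 11 |
| 4C1R | 534 | 20 | 6 | 114 | 11 |
| 4C1Rh | 33 | 16 | 6 | 30 | 12 |
| 4C2R | 267 | 16 | 7 | 92 | 12 |
| 4C2Rh | 12 | 12 | 6 | 39 | 13 |
| 4C3R | 27 | 16 | 6 | 44 | 13 |
| 4C4R | 1 | 17 | 17 | 17 | 14 |
| 5C | 469 | 16 | 6 | 62 | 12 |
| 5Ch | 29 | 17 | 6 | 27 | 14 |
| 5C1R | 1137 | 14 | 6 | 60 | 13 |
| 5C1Rh | 7 | 14 | 6 | 60 | 14 |
| 5C2R | 300 | 14 | 6 | 41 | 14 |
| 5C2Rh | 9 | 14 | 7 | 27 | 15 |
| 5C3R | 75 | 14 | 6 | 40 | 15 |
| 5C3Rh | 1 | 18 | 18 | 18 | 16 |
| 5C4R | 2 | 10 | 10 | 10 | 16 |
| 6C | 1922 | 12 | 6 | 56 | 14 |
| 6Ch | 97 | 11 | 6 | 25 | 15 |
| 6C1R | 1503 | 12 | 6 | 52 | 15 |
| 6C1Rh | 58 | 10 | 6 | 22 | 16 |
| 6C2R | 618 | 12 | 6 | 44 | 16 |
| 6C2Rh | 47 | 12 | 6 | 29 | 17 |
| 6C3R | 12 | 15 | 6 | 30 | 17 |
| 7C | 262 | 13 | 6 | 41 | 16 |
| 7Ch | 10 | 15 | 6 | 39 | 17 |
| 7C1R | 322 | 12 | 6 | 43 | 17 |
| 7C1Rh | 7 | 17 | 6 | 43 | 18 |
| 7C2R | 17 | 16 | 6 | 25 | 18 |
| 7C3R | 5 | 15 | 6 | 18 | 19 |
| 8C | 310 | 12 | 6 | 35 | 18 |
| 8Ch | 6 | 10 | 7 | 17 | 19 |
| 8C1R | 53 | 16 | 6 | 37 | 19 |
| 8C2R | 12 | 17 | 8 | 19 | 20 |
| 8C3R | 7 | 10 | 6 | 13 | 21 |
| 9C | 63 | 14 | 6 | 24 | 20 |
| Total | 8799 | | | | |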

For the second table, the takeaway is that the average cMs track pretty close to each other at the same meiosis numbers. And after meiosis level 9, which averages 27cM, the “curve” quickly “flatlines” in the mid teens. This is reflected at DNA Painter with many relationships all in play under 20cM.

Table 2: 8,799 AncestryDNA Matches Summarized by Meiosis Events

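| MRCA | # Matches | avg cM | low cM | high cM | meiosis events | avg cM for level |
|---|---|---|---|---|---|---|
| 2C | 1 | 269 | | | 6 | 269 |
| 1C2Rh | 3 | 138 | 78 | 200 | 7 | |
| 2C1R | 14 | 127 | 34 | 220 | 7 | 132 |
| 2C2R | 8 | 47 | 39 | 162 | 8 | |
| 3C | 57 | 63 | 13 | 208 | 8 | 55 |
| 2C3R | 2 | 34 | 22 | 140 | 9 | |
| 3Ch | 5 | 20 | 16 | 95 | 9 | 27 |
| 3C1R | 139 | 28 | 6 | 148 | 9 | |
| 3C1Rh | 26 | 28 | 7 | 111 | 10 | |
| 3C2R | 106 | 22 | 6 | 68 | 10 | 25 |
| 4C | 128 | 24 | 6 | 220 | 10 | |
| 3C2Rh | 20 | 20 | 6 | 92 | 11 | |
| 3C3R | 34 | 22 | 6 | 58 | 11 | 18 |
| 4Ch | 7 | 12 | 6 | 19 | 11 | |
| 4C1R | 534 | 20 | 6 | 114 | 11 | |
| 3C3Rh | 12 | 23 | 8 | 40 | 12 | |
| 3C4R | 1 | 20 | | | 12 | |
| 4C1Rh | 33 | 16 | 6 | 30 | 12 | 18 |
| 4C2R | 267 | 16 | 7 | 92 | 12 | |
| 5C | 469 | 16 | 6 | 62 | 12 | |
| 3C4Rh | 2 | 10 | 8 | 12 | 13 | |
| 4C2Rh | 12 | 12 | 6 | 39 | 13 | 13 |
| 5C1R | 1137 | 14 | 6 | 60 | 13 | |
| 4C3R | 27 | 16 | 6 | 44 | 13 | |
| 4C4R | 1 | 17 | 17 | 17 | 14 | |
| 5Ch | 29 | 17 | 6 | 27 | 14 | |
| 5C1Rh | 7 | 14 | 6 | 60 | 14 | 15 |
| 5C2R | 300 | 14 | 6 | 41 | 14 | |
| 6C | 1922 | 12 | 6 | 56 | 14 | |
| 5C2Rh | 9 | 14 | 7 | 27 | 15 | |
| 5C3R | 75 | 14 | 6 | 40 | 15 | 13 |
| 6Ch | 97 | 11 | 6 | 25 | 15 | |
| 6C1R | 1503 | 12 | 6 | 52 | 15 | |
| 5C3Rh | 1 | 18 | 18 | 18 | 16 | |
| 5C4R | 2 | 10 | 10 | 10 | 16 | 13 |
| 6C1Rh | 58 | 10 | 6 | 22 | 16 | |
| 6C2R | 618 | 12 | 6 | 44 | 16 | |
| 7C | 262 | 13 | 6 | 41 | 16 | |
| 6C2Rh | 47 | 12 | 6 | 29 | 17 | |
| 6C3R | 12 | 15 | 6 | 30 | 17 | 13 |
| 7Ch | 10 | 15 | 6 | 39 | 17 | |
| 7C1R | 322 | 12 | 6 | 43 | 17 | |
| 7C1Rh | 7 | 17 | 6 | 43 | 18 | |
| 7C2R | 17 | 16 | 6 | 25 | 18 | 15 |
| 8C | 310 | 12 | 6 | 35 | 18 | |
| 7C3R | 5 | 15 | 6 | 18 | 19 | |
| 8Ch | 6 | 10 | 7 | 17 | 19 | 13 |
| 8C1R | 53 | 16 | 6 | 37 | 19 | |
| 8C2R | 12 | 17 | 8 | 19 | 20 | 15 |
| 9C | 63 | 14 | 6 | 24 | 20 | |
| 8C3R | 7 | 10 | 6 | 13 | 21 | |
| Total | 8799 | | | | | |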
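The per-level figure in the last column of Table 2 appears to be the plain mean of the row averages at that meiosis level (for example, 138 and 127 give 132 at level 7). That reading is an inference from the numbers, not something stated in the post. The sketch below recomputes it for a few rows copied from the table and, for comparison, also shows a Match-weighted mean; the function name and the row subset are chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean

# (MRCA, number of Matches, avg cM, meiosis events) copied from a few table rows
rows = [
    ("1C2Rh", 3, 138, 7),
    ("2C1R", 14, 127, 7),
    ("3C1Rh", 26, 28, 10),
    ("3C2R", 106, 22, 10),
    ("4C", 128, 24, 10),
    ("3C4Rh", 2, 10, 13),
    ("4C2Rh", 12, 12, 13),
    ("5C1R", 1137, 14, 13),
    ("4C3R", 27, 16, 13),
]

def level_averages(rows):
    """Return {meiosis level: (simple mean, Match-weighted mean)} of the row averages."""
    by_level = defaultdict(list)
    for _, matches, avg_cm, events in rows:
        by_level[events].append((matches, avg_cm))
    result = {}
    for events, items in sorted(by_level.items()):
        simple = mean(avg for _, avg in items)
        weighted = sum(n * avg for n, avg in items) / sum(n for n, _ in items)
        result[events] = (simple, weighted)
    return result

for events, (simple, weighted) in level_averages(rows).items():
    print(events, round(simple, 1), round(weighted, 1))
```

The weighted figure leans toward the relationships with the most Matches (5C1R dominates level 13), so the two means can differ by a cM or more at the same level.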

Sidebar – this evaluation also acts as a Quality Control indicator. Watch for data points way outside the norms. I had three Matches who skewed one of the numbers. I went back to them – they were close to each other and I was sure they were from an NPE. Upon reevaluation, they needed to be a generation closer to our CA. I made the shift, and all the numbers fell back into the norm.

These insights are helping me with a new review of Walking The Clusters Back, wherein I need to use judgment when imputing relationships and CAs.

[06G] Segment-ology: Insights into cM Patterns; by Jim Bartlett 20260122

How Old Are Your Segments?


Well, it depends… Your chromosomes are very large segments, which are not very old at all.  On the other hand, I have some small DNA segments from Neanderthal Ancestors – pretty old. In general, the smaller the segment, the older it is. But let’s think about this for a moment.

 This discussion will be about your DNA segments – large segments from close relatives to ever smaller segments from more and more distant relatives. They are all part of the DNA you inherited from your Ancestors. Segments are formed at the moment of conception – when sperm meets egg – about nine months before you were born. They don’t change until you pass them on – after recombination and new crossovers – to the next generation. So our unit of “age” measurement is a generation.

So, let’s start with the largest “segments” – your 44 autosomes passed to you by your parents. How old are these 44 chromosomes? Well, they are 0 generations old. You are the first person to ever have each of these specific – full chromosome – segments.*

Then let’s look at your grandparent segments that make up your chromosomes. On average you have 22 chromosomes, subdivided by 34 crossovers, for 56 grandparent segments per Side. These were each part of full, new, chromosomes passed from your grandparents to your parents; and then, one generation later, passed to you by a parent – they are 1 generation old. Again, due to random recombination for every child, you are the first person to ever have these specific segments.*

Similarly, your great grandparents, passed new chromosomes to your grandparents, who passed segments to your parents who passed segments to you, which would be unique and 2 generations old.*

You get the picture. The unique segments in each of your Ancestors are recombined into new segments and passed down – generation after generation. Your segments are “imbedded” in the chromosomes and large segments they passed down. And knowing the genealogy of each segment, we can count the generations to find their age – always one less than the number of Ancestor generations back.*

* So what’s with that pesky asterisk? In short, “sticky” segments. Some segments are passed down intact – they are exactly the same segment in an Ancestor and their child (who is also your Ancestor) – they were not subjected to a recombination crossover. More likely than not, one of your smaller chromosomes (Chr 18 to 22) was passed from a parent to you intact. So, in that particular case, its age is 1 generation (not 0 generations like all the other chromosomes). And this happens to some of the other segments passed down at each generation. Above we noted that you got about 56 grandparent segments from one parent. When you pass these to your children, recombination will create about 34 new crossovers. In general, they will be subdivisions of 34 of the 56 grandparent segments passed down to you – leaving 22 grandparent segments intact. You only pass half of your DNA to each child, but that still includes about 11 grandparent segments which are now 2 generations old!
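The arithmetic in the paragraph above can be laid out as a small Python sketch. The function names are invented for the example, the inputs are the averages quoted in this post (22 chromosomes, 34 crossovers), and the middle step assumes each new crossover lands in a different segment, which is a simplification.

```python
def segments_per_side(chromosomes: int = 22, crossovers: int = 34) -> int:
    """Each crossover splits one piece in two, so it adds one segment."""
    return chromosomes + crossovers

def intact_segments(segments: int, new_crossovers: int = 34) -> int:
    """Segments left uncut, assuming every new crossover hits a different segment."""
    return segments - new_crossovers

def intact_passed_to_child(intact: int) -> float:
    """Only half of the DNA goes to each child, so on average half of the intact segments do too."""
    return intact / 2

grandparent_segments = segments_per_side()
untouched = intact_segments(grandparent_segments)
print(grandparent_segments, untouched, intact_passed_to_child(untouched))
```

If two crossovers fall in the same segment, more segments stay intact than this count suggests, so treat the result as a rough average rather than a fixed number.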

It gets complicated real quick!

This is one of the reasons that as segments get smaller, the range of possible relationships increases. A given segment may have persisted for several generations, or not.

Chromosome Mapping of segments with MRCAs lets us figure this out. Even if our Map is not complete, at least some areas of our chromosomes can be figured out. Someday… it will be interesting to try to determine a Shared cM Chart which figures in the age of the segment. I’ll bet the ranges would be somewhat smaller…

[O5H] Segment-ology: How Old Are Your Segments? by Jim Bartlett 20251218

Shared Segment Spreadsheet Incarnations


For me, the Shared Segment Spreadsheet is a critical tool, which evolves through four incarnations.

1. It starts as a collection of all your shared DNA segments – from each company. This also means a collection of all your Matches (except AncestryDNA), some with multiple shared segments. It can be searched and sorted.

2. Use as a segment Triangulation tool. Sort on: Company + Chromosome + Segment Start to arrange all the shared segments (within a company) into Chromosomes. And within a Chromosome they are arranged so that overlapping segments are close to each other.  With this “view” each segment is Triangulated with other overlapping segments, or not. Maternal and Paternal Triangulated groups are formed*. Some of the under-15cM segments will not Triangulate and are labeled “false” and deleted or moved out of this spreadsheet –  “everybody’s got to be somewhere.”  This process is repeated for each company.

3. Form/identify Triangulated Group (TG) segments. Sort on: Side + Chromosome + Segment Start to separate the maternal and paternal segments and sort them in order within each chromosome. Since this spreadsheet is comparing all these shared segments with your own DNA segments, the shared segments from different companies will “break” into TG segments that align with your own segments. However, this phase of the process requires some judgment – the data is a little fuzzy and the ends of TGs will not be precise. You have to make a call. In general, to align with your DNA segments, each TG will end at the same Mbp as the next one starts. Make those calls and assign a TG Identification (TG ID)** for each segment. Make a TG segment header row for each one (I have 372 TG segments) that lock in the overall TG start and end positions and TG ID. TIP: make the TG header start location 0.01Mbp less than the first shared segment in the TG – so it sorts on top of the individual segments. Remember that every Match in a TG is related to you on your line back to a specific Common Ancestor (CA). Note: some small segments in a TG may go back further.

4. Use these TG groups to do the genealogy! Among the Matches, find the consensus path to the CA.
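The sort in step 2 and the header-row TIP in step 3 can be sketched as follows. This is an illustration under stated assumptions: the `Segment` fields and function names are hypothetical, positions are in Mbp, and a run of overlapping segments is only a candidate group. Overlap alone does not prove Triangulation, which still needs the Matches to match each other on the same Side.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    company: str
    match_name: str
    chromosome: int
    start_mbp: float
    end_mbp: float

def sort_for_triangulation(segments):
    """Sort on Company + Chromosome + Segment Start, as in step 2."""
    return sorted(segments, key=lambda s: (s.company, s.chromosome, s.start_mbp))

def overlap_runs(segments):
    """Split the sorted list into runs of overlapping segments (candidate groups only)."""
    runs, current, current_end = [], [], 0.0
    for seg in sort_for_triangulation(segments):
        continues = (
            bool(current)
            and seg.company == current[0].company
            and seg.chromosome == current[0].chromosome
            and seg.start_mbp < current_end
        )
        if continues:
            current.append(seg)
            current_end = max(current_end, seg.end_mbp)
        else:
            if current:
                runs.append(current)
            current, current_end = [seg], seg.end_mbp
    if current:
        runs.append(current)
    return runs

def tg_header_start(run):
    """Header start 0.01 Mbp below the first segment, so the header sorts on top."""
    return round(min(s.start_mbp for s in run) - 0.01, 2)
```

A spreadsheet does the same job with a multi-column sort; the code only makes explicit what "close to each other" means once the rows are in order.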

Summary: A shared segment spreadsheet has several uses – collection > Triangulation > TG ID > genealogy. The TG segment is your DNA segment. This covers all of your genetic genealogy, but you can always focus on one or more individual TGs, if you don’t want to eat the whole elephant at one time.

*I’ve covered the Triangulation process in other blogposts, and won’t repeat that here – this blogpost is about the four incarnations of the spreadsheet.

**I’ve covered TG IDs in other blogposts

[35BBa] Segment-ology: Shared Segment Spreadsheet Incarnations by Jim Bartlett 20251102

Walk The Clusters Back with AncestryDNA


AncestryDNA has just rolled out enhancements to their Clustering Program that let you “Create custom clusters”. At AncestryDNA > DNA > Matches > By Clusters/Pro > Create custom clusters. You must have the additional subscription for ProTools to access this program.  I have not run it through its paces yet, but I wanted to review the Walk The Clusters Back (WTCB) concept, and ask for feedback on your experience with it.

The concept of WTCB is to adjust the cM range to focus on two generations at a time. The idea is to “solve” the Clusters for close relatives and then adjust the range down to include Matches in the next generation back, and then see where the Clusters separate into more distant Clusters. Start easy with a range of 90-400cM which is the recommendation for the LEEDS method to determine four groups. This would be roughly four Clusters with each one focused on a separate grandparent. Tag (by Dots or by Notes) every Match to the appropriate grandparent. Then drop the range to, say, 70-200cM to get mostly Clusters that include Matches who are 1C on a grandparent, and 2C on a Great grandparent. I don’t know of anyone who has found a “sweet-spot” range for each generation, and I suspect it might be different for each of us. The last time I did this WTCB I had to “fiddle” with the ranges – and never could find any range that gave me only Clusters with Matches from only two generations in each Cluster. So, get used to that.

The point is to notice when some Matches you’ve tagged to an Ancestor then show up in different Clusters based on a new range – and then determine which sides are represented by the new Clusters. Then tag all of those new Matches appropriately. Example: you have a Cluster with 20 Matches that is focused on a Great grandparent. Tag all the Matches with that Great grandparent (if not already tagged as a closer Ancestor). Adjust the range to add more Matches. Look in the new Clusters for the previously Tagged Matches – hopefully there are two new Clusters, but maybe three. From my experience there may be two Clusters with 15 to 25 Matches, each of which includes some of the 20 Matches from the previous run. These new Clusters would represent the next generation back, and the focus of each would be only one of the two parents of the previous Cluster’s Ancestor.
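The bookkeeping in that example can be sketched in Python. AncestryDNA does the Clustering itself; the sketch only assumes you have typed the results into simple dictionaries, and every name and number in it is hypothetical.

```python
def in_range(matches, low_cm, high_cm):
    """Keep only the Matches whose shared cM falls inside the chosen range."""
    return {name: cm for name, cm in matches.items() if low_cm <= cm <= high_cm}

def trace_tagged(tagged_names, new_clusters):
    """For each new Cluster, list the previously tagged Matches it contains."""
    placement = {}
    for cluster_id, members in new_clusters.items():
        found = sorted(set(members) & set(tagged_names))
        if found:
            placement[cluster_id] = found
    return placement

# Hypothetical data: Matches tagged to one Ancestor in the previous run,
# and the Clusters produced by a run with a lower cM range.
tagged = {"Ann", "Bob", "Cal"}
new_run = {1: ["Ann", "Eve"], 2: ["Bob", "Cal", "Fay"], 3: ["Gus"]}
print(trace_tagged(tagged, new_run))
```

Clusters that pick up previously tagged Matches (1 and 2 here) are the candidates for the next generation back on that line; Clusters with none (3 here) belong somewhere else in the Tree.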

Yes, it gets harder and harder with each new generation. The good news is that a Cluster with known Matches from one generation can only morph into Clusters going back from that one Ancestor. This reduces the genealogy effort. If you’ve reviewed all of your ThruLines (and used ProTools to add even more Matches), you have likely tagged a lot of Matches out to 6C. So as the 4C and 5C and 6C Clusters start to form (as you reduce the cM range), you may already see the Ancestor for the new Clusters by looking at the Notes.

Use your judgment, and fiddle with the cM ranges. Please report back on your experience, and/or if you find a sweet spot for some range. Note that the sweet spot should include two generations – the one you’ve figured out and the next one you are working on.

[19P] Segment-ology: Walk The Clusters Back at AncestryDNA by Jim Bartlett 20251017

Boundaries of a Triangulated Segment Part 2


Thanks to all for your responses to my last blogpost. All of them are a good read.

I had always thought a TG segment was crystal clear… WRONG. Per the classic refrain from the Legal Genealogist, Judy Russell: “It depends!”  My second ever blogpost on 9 May 2015 (Benefits of Triangulation) stated 16 benefits, including: Organizing most Matches into TGs; All Matches in a TG have the same Common Ancestor; the TGs define crossovers and a Chromosome Map; TGs are equivalent to Phased data. What I didn’t say explicitly is that each TG represents a segment of my DNA.

The elephant in the room is: who was the first Ancestor to pass down that segment (as part of a full chromosome passed to a child who is my Ancestor)? In other words, in what earliest generation did that full segment first exist in my line? There may be a  different such “elephant”  for each Match… but that’s another story.  

So back to “it depends”…. For me there are 3 objectives:

1. “See” my DNA segments. Divide up my chromosomes into discrete segments, each one of which came from a specific Ancestor.

2. Determine the Ancestor for each segment.

3. Determine my Chromosome Map of segments – each segment being adjacent to another segment from the beginning to the end of each of my 45 chromosomes.

When I started forming Triangulated Groups, I only worked with known cousin Matches. It created a patchwork of TGs. One day I decided to bite the bullet and Triangulate all of my segments, a company at a time (FTDNA, 23andMe and MyHeritage).  It took months without many of the tools we have today. And the three versions meshed virtually exactly! That was as expected since all comparisons were against my DNA. I was using the “full” version of a TG, plus some judgment for large segments from close relatives that spanned more than one TG. This brings me to a significant factor in Triangulation: Judgment.

Judgment: It’s easy to compare yourself to another Match and “see” an exact shared DNA segment. But what would happen if Match 3 in the last blogpost only overlapped Match 1 by 5cM? Would we then call this a 5cM TG (against the rules) and throw the whole thing out? Would we discard Match 3 (even if they had a robust Tree that included a CA)?

Judgment: Sometimes there is a close relative, Match 5, who overlaps much more than me and Matches 2 and 4. Experience (and judgment) tells me that this somewhat larger segment is probably from a close relative whose Common Ancestor with me includes a father/mother more distant – with one of them being the CA for the full TG.  

As I read over the comments of the previous blogpost, several words pop into my mind: context, messy, complex, judgment, imprecise, etc., as well as “we’re making this up as we go”.

Messy – yes, Triangulating all of our Match segments against our own can be messy – and judgment is needed. Given the random nature of recombination, I do see some curve balls from time to time. Triangulation usually identifies false (IBS) segments, which should be discarded. If I find a shared segment that really messes things up, I’ll also discard it (or at least highlight it as weird). As I’ve blogged before, the raw data is sometimes messy – or fuzzy – sometimes reporting a shared DNA segment that runs longer than it should. Although my parents are not related (per GEDmatch), I do have one area of my DNA where my two parents combined have all of the most common SNPs, and so I get a “zigzag” pileup of many Matches with false segments there. I’ve identified this area and I toss out those Match segments (<10cM). Pedigree collapse and endogamy also create messy areas. To the extent possible, identify these specific locations with a dummy segment to highlight the potential issue.

Context – in developing my Chromosome Map, the segments will be adjacent to each other.  I look for the previous and the following TGs to the one I am working on. Ideally (and actually) each of my segments will “crossover” to the next segment which is from a different Ancestor of mine. Note – that “next” Ancestor may involve a different grandparent, or a different 3xG grandparent. We have to fill out the Chromosome map to figure that out, but it is important to remember that the next TG will have a different CA. So if I accept the conservative TG (a part of the Match 3 shared segment), what different Ancestor can I find for all of the “leftover” shared DNA segment pieces of my DNA?

Complex – One complex part of this analysis is what about the parts of true segments from Match 1 and 2 and 4 that are not in the full TG I show in blue? I focus on my DNA, but I think every true Segmentologist should try this experiment with say Match 2 at GEDmatch. Use Segment Search to find other Matches who share the same segment and build the TG for Match 2 – it “will” be different than my (or your) TG. A little different or a lot different? If Match 2 is a known cousin, the same MRCA would almost always apply. By doing this with other Matches in a TG, many of us (working together) are building a larger segment of the CA.

Imprecise – I’ve blogged about fuzzy data. I counter this with judgment. I look at all the segment data for a TG (all my segments are in one spreadsheet). Among the TG fuzzy start data I decide on a specific Mbp start location. Then I decide on a Mbp start location for the next (adjacent) TG. Often some shared segments from the initial TG will “spill over”, past the start of the next TG. The small amounts of spillover, I just ignore: fuzzy data. If there is a large spillover, I’ll consider if the second TG is potentially closely related to the first TG, or not.

Imprecise – This also describes the fact that all your shared DNA segments may not “cover” all of your DNA neatly, or uniformly, or even completely. The shared DNA segments are independent and random – they are not at our beck and call…  They don’t necessarily help us fill the gaps perfectly. They are what they are – they are clues we must use as best we can.

All of the above is to indicate that all IBD shared segments should have a home in a TG, and that all the TG segments should cover all of your chromosomes, IMO. Remember, at each generation, all of your segments from that generation must add up to all your chromosomes!

Another aspect of this which I muse about is the SNPs – thousands of them in a unique arrangement in my DNA. Let’s say Match 1 shares 2,000 SNPs with me. Alone we would say the shared DNA segment between us (green) came from a Common Ancestor. Similarly we would say the 3,000 SNPs in the shared segment with Match 2 was from a CA. I don’t see how we could argue that these two CAs were somehow different. I think it is much more likely that the CA is the same, and Match 1 just didn’t get the full segment that I did and Match 2 did. Match 3 is in the middle of all these SNPs – surely Match 3 got the same SNPs from the overlapping locations. By comparing the SNP values of all 4 Matches, I’m confident that we’d find the same values at each SNP location.

Note: all of these Matches and evaluations are based on separated cousins. Of course close relatives could have the same segments and SNPs – the whole concept of segment Triangulation depends on an analysis of more distant relationships.

My summary:

The TG Group of Matches should all look for the same Common Ancestor – and hopefully help each other toward that goal.

The full TG segment (blue) is my DNA segment, which I can use as part of my Chromosome Map. It defines my crossover points. Also I can contribute my SNPs to any larger study of my Ancestor’s DNA.

I must be careful to not state that my Matches have this TG segment. Matches will have their own different, but overlapping, TG segment.

The Common Ancestor almost certainly passed down a larger DNA segment, through at least some of their children, which different descendants (including some of my Matches) got. Note: there may be other descendants who have DNA tested who may share with the TG Matches, but not me (I am not the center of the universe…)

[08Ab] Segment-ology: Boundaries of a Triangulated Segment Part 2 by Jim Bartlett 20250915

Boundaries of a Triangulated Segment


I presented “More Segmentology” today at the East Coast Genetic Genealogy Conference. I was questioned on a slide grouping segments into a Triangulated Group, and it appears there is a debate about this. I’d like to have your input on this.  

Here is my slide:

I show 4 Matches with overlapping Shared Segments with me on one side (parent). This is the definition of a Triangulated Group, which I showed in the bottom Chromosome – in green. What we can “see” is only the Shared Segments from Matches 1 to 4 in green.  I contend that Matches will rarely have segments that are exactly the same as my segment. So for the purpose of illustration, I guessed that their segments from our Common Ancestor were almost always different – that sometimes their segments started to the left of mine, sometimes to the right of mine; and sometimes the same ending, and other possibilities shown. In fact, I have tested this at GEDmatch where I could Triangulate with each Match as the base, and sure enough, they had their own, different, Triangulated segment. I went on to claim that my segment (from our Common Ancestor) started where the Shared Segments had their earliest start; and ended where the Shared Segments had their latest end – as shown in the green Triangulated Group segment above. The start and end of the TG defined my segment. Some others contended that the Triangulated Group segment should be shown as only the green that was common to all 4 Matches – like the space between the two vertical blue lines.
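The two positions in this debate reduce to two different calculations over the same Shared Segments. The Python sketch below uses made-up start and end positions (in Mbp) for four Matches; the function names are invented for the illustration.

```python
def full_tg(segments):
    """Earliest start to latest end: the 'full' boundary argued for in this post."""
    starts, ends = zip(*segments)
    return min(starts), max(ends)

def common_tg(segments):
    """Only the region shared by every Match: the conservative boundary."""
    starts, ends = zip(*segments)
    start, end = max(starts), min(ends)
    return (start, end) if start < end else None

# Hypothetical (start, end) positions for Matches 1 to 4
shared = [(10, 30), (15, 40), (20, 35), (18, 45)]
print(full_tg(shared))
print(common_tg(shared))
```

Adding one Match that barely overlaps the others can shrink the common region to almost nothing (or to `None`), while the full boundary only ever grows or stays the same.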

I don’t know of any Scientific Paper that defines the boundaries of a Triangulated segment. So I am interested in your perspective, and why.

[08Aa] Segment-ology: Boundaries of a Triangulated Segment by Jim Bartlett 20250914

A New Cluster on the Block


AncestryDNA has rolled out an “auto” Cluster program. I tried it and got 8 Clusters, ranging from 3 to 9 Matches in each one. A total of 40 of my 60 Matches above 65cM. The other 20 Matches were not included because they didn’t form a Cluster of at least 3 Matches. I know the Common Ancestors for each of the 40 Matches and the program clustered them 100% correctly. I’d give AncestryDNA an A+ for this new program. I’m impressed and anxious to have the ability to adjust the cM ranges downward to get more Clusters.

Some additional input on auto-Clustering.

It began in late 2018, with Genetic Affairs (by EJ Blom), and soon we also had Shared Clustering (by Jonathan Brecher) and DNAGedcom Client (by Rob Warthen). I tried all three. I had already done segment Triangulation on all my Matches at FamilyTreeDNA, and I worked with Jonathan Brecher and we Clustered those same Matches. There was over 90% concurrence between the hundreds of Clusters and the hundreds of Triangulated Groups. Not enough to say the two processes were equivalent (they are not), but certainly this analysis showed a strong tendency of Clusters to point to a Common Ancestor between me and all the Matches in each Cluster. A very strong clue in each case.

I then Clustered all of my Matches at AncestryDNA – down to about 18cM. Many of the Clusters had a Common Ancestor consensus (easily seen in the Match Notes I had previously entered – many from ThruLines). So, I imputed that Common Ancestor to the rest of the Matches in each Cluster. I used Ahnentafel numbers to represent my Ancestors and developed a tagging code: e.g. #A0020. The #A means a confirmed Common Ancestor with a Match, and 20 is Ahnentafel for William MITCHELL 1824-1895. This code is the first thing in the Notes field. When I impute a Common Ancestor to a Match from a Cluster consensus, I use #L0020 – which means the Match is highly Likely to have that Common Ancestor with me. With a #A or a #L, I tagged almost all my Ancestry Matches over 20cM and many below that. This was in the 2019-21 time frame.

Recently, with ProTools, I’ve been able to determine how many more Matches fit into my Tree – and thus our Common Ancestor. For well over 90% of all these new Match cousins, the #L tag turned out to be correct – I only needed to change the L to A.
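The #A and #L tagging code lends itself to a small helper. The sketch below is not part of any AncestryDNA feature; it assumes the Ahnentafel number is always zero-padded to four digits (as in #A0020) and that the tag is the first thing in the Notes field. The function names are made up.

```python
import re

TAG = re.compile(r"^#([AL])(\d{4})\b")

def make_tag(ahnentafel: int, confirmed: bool) -> str:
    """Build '#A0020' for a confirmed CA or '#L0020' for a Likely one."""
    return f"#{'A' if confirmed else 'L'}{ahnentafel:04d}"

def parse_tag(note: str):
    """Return (confirmed, ahnentafel) from the start of a note, or None if untagged."""
    m = TAG.match(note)
    if m is None:
        return None
    return m.group(1) == "A", int(m.group(2))

def confirm(note: str) -> str:
    """Change a leading #L tag to #A once the Common Ancestor is confirmed."""
    return re.sub(r"^#L(\d{4})\b", r"#A\1", note)

print(make_tag(20, confirmed=False))
print(confirm("#L0020 imputed from Cluster consensus"))
```

Because the tag comes first and has a fixed width, sorting or filtering exported Notes by their first six characters groups Matches by Ancestor.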

Bottom line 1: I am a big fan of Clustering at AncestryDNA and really look forward to expanding the coverage to more Matches.

Bottom line 2: Use ProTools with Clustered Matches to really nail down Common Ancestors to Matches.

[22DI] Segment-ology: A New Cluster on the Block by Jim Bartlett 20250725

Segment Triangulation Insight


Your DNA segments are from your Ancestors. They are adjacent to each other and fill up (or “cover” or paint) each of your Chromosomes. You have shared DNA segments with your Matches. With a browser, you can see your shared DNA on a chromosome – visually as a bar and by the start and end points in the data. Segment Triangulation lets us group overlapping segments and identify your full segment from an Ancestor. It also places each Triangulated segment where it belongs on one of your 46 chromosomes. Genealogy helps you decide if each segment is on a maternal or paternal chromosome. Once you do that, it’s then relatively easy to “fit” the Triangulated segments along each chromosome.  

Three key elements of Segment Triangulation:

1. A browser to give you the data – where is each segment on a chromosome.

2. Determine which segments are on the same chromosome (you have two of each chromosome – one maternal and one paternal). Several ways to do this…

3. Determine where one of your segments stops and another starts – i.e. the crossover points. A judgment call based on the consensus of the data.

A fourth key element is determining the MRCA for the Triangulated segment, and the path the segment took from the MRCA down a line of your Ancestors to a parent to you. This is mainly a genealogy task, working with your Matches and their Trees to build a consensus.

I hope this “insight” provides a clearer picture of what Segment Triangulation is all about and why it is a worthwhile process – for specific segments or all of your DNA.

[08F] Segment-ology: Segment Triangulation Insight by Jim Bartlett 20250525

Half-Identical Region (HIR)


Your DNA segments (that make up the 23 Chromosomes passed down to you from a parent) are not the same as shared DNA segments with a Match (as described by a chromosome browser) aka a Half Identical Region (HIR). All of your DNA is real, down to any size you want to analyze. This is not necessarily so for a shared DNA segment (or HIR)!

From the ISOGG Wiki: A half-identical region (HIR) is a region of two paired chromosomes where at least one of the two alleles from one person’s pair of chromosomes matches at least one of the two alleles from a different person’s pair of chromosomes throughout the entire region. A half-identical region may be either identical by descent (IBD) or identical by state (IBS).

In my words, for genetic genealogy, a computer compares your DNA test to a potential Match’s DNA test. The computer compares the two raw DNA data files – about 600,000 SNPs with two values (alleles) for each SNP. The two values are one from the DNA passed down from the father and one from the mother. The computer is looking for a long string of matching SNPs, which are then reported as a shared DNA segment. This meets the HIR definition above – at least one value is the same at each SNP in the shared segment. The theory is that, although much of our DNA will be the same, there is some variation, and a long enough string of matching SNPs will indicate this segment of DNA is from a Common Ancestor. This also implies that the long string is on one side – on one chromosome from our mother OR our father. A lot of reported genetic data indicates that such an HIR is true when it’s at least 15cM.

But why aren’t all shared DNA segments true? Because the computer algorithm blindly looks at *both* values at each SNP for you and the potential Match. The computer may create a string of your SNPs that agree with your potential Match’s SNPs, but some are from your father and some from your mother. Clearly this “zig-zag” result, using SNPs from both your parents’ DNA, is not a representation of your DNA on one chromosome. It’s not a DNA segment passed down from one of your parents to you. It’s a false segment! Or this might have happened with your potential Match’s data, or with both of you. Bottom line: wherever the “zig-zag” occurred, the shared DNA segment is false.
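A toy Python example makes the “zig-zag” concrete. It is a deliberate simplification: four SNPs instead of hundreds of thousands, no cM or SNP-count thresholds, and no mismatch tolerance, all of which real matching algorithms use. The genotypes and function names are invented for the illustration.

```python
def half_identical(g1: str, g2: str) -> bool:
    """True if two unphased genotypes (e.g. 'AG' and 'AA') share at least one allele."""
    return bool(set(g1) & set(g2))

def longest_hir_run(person_a, person_b) -> int:
    """Length of the longest run of consecutive half-identical SNPs."""
    best = run = 0
    for g1, g2 in zip(person_a, person_b):
        run = run + 1 if half_identical(g1, g2) else 0
        best = max(best, run)
    return best

def haplotype_matches(haplotype: str, person) -> bool:
    """True if ONE chromosome's alleles fit the other person's genotype at every SNP."""
    return all(allele in genotype for allele, genotype in zip(haplotype, person))

# My paternal chromosome reads AAGG and my maternal one reads GGAA,
# so my unphased test result is heterozygous at all four SNPs.
me = ["AG", "AG", "GA", "GA"]
other = ["AA", "AA", "AA", "AA"]

print(longest_hir_run(me, other))
print(haplotype_matches("AAGG", other), haplotype_matches("GGAA", other))
```

The unphased comparison reports a match across all four SNPs, yet neither of my two chromosomes matches on its own: the run only exists by switching from the paternal values to the maternal values partway through, which is the false segment described above.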

The good news is that this “zig-zag” result doesn’t occur with long enough segments – over 15cM. And it occurs very infrequently with 14cM shared DNA segments. There is a rough distribution curve – probably different for each of us – which drops to the point where about half of our 7cM segments are false. And most shared DNA segments below 7cM are false – which is why they are generally not used. Some of the companies use other, proprietary, algorithms to discard (not report) some of these false Matches. Also, as I’ve blogged before, Triangulated Groups are very good at culling out the false segments.

This also ties into the ISOGG terms: Identical By Descent (IBD) and Identical By State (IBS), noted above. IBD would apply to true shared DNA segments – you and your DNA Match got the shared DNA segment from a Common Ancestor. IBS means the computer found a “match”, but IBS is usually used in genetic genealogy to indicate the false segments. I usually just stick to “true” and “false” shared DNA segments (or HIRs).

Another quirk in this discussion is using the term HIR to refer to a shared DNA segment.  This is proper and OK. But, an HIR only refers to a shared DNA segment between you and one particular Match. We virtually never find exactly the same HIR with two Matches (although it’s possible with Matches who are closely related to each other.) When we look at segment Triangulation, the Triangulated Group is comprised of different HIRs. So HIR should not be used to refer to a TG. A TG represents a segment of your DNA (from a specific Ancestor) – there are many different HIRs in a TG. And each Match in a TG would have a different (but overlapping) segment from the Common Ancestor, with different HIRs. Because the whole process is so random, we just don’t get the same segments from our Common Ancestors that our Matches get.

Bottom Line: A shared DNA segment is also an HIR – formed by a computer by comparing raw DNA test data (about 600,000 SNPs) with two values (alleles) for each SNP. Shared DNA segments over 15cM are all true (IBD); below 15cM some are false (IBS). A shared DNA segment (aka HIR) is usually unique to a specific Match.

[22DH] Segment-ology: Half Identical Region by Jim Bartlett 20250521

HAPPY 10TH ANNIVERSARY

10 years ago, I blogged: “What is a segment?”, and noted the difference between an ancestral segment (your DNA segment) – passed down from an Ancestor to you; and a shared segment (created by a computer algorithm) which usually indicates a Common Ancestor for both you and your Match.

This is still the fundamental concept that is key to genetic genealogy.

We’ve looked at a lot of twists and turns based on this concept…

– How segments are measured

– Why the data is a little fuzzy, but that doesn’t negate its power

– How our DNA is passed down in identifiable segments from our Ancestors

– How each generation of our Ancestors contributes two full genomes (46 Chr) to us

– Why some of our segments must be sticky (persistent) for multiple generations

– How we “see” our own segments through shared segments

– How we can map (or paint) our segments on our chromosomes

– How shared segment “size” predicts relationships

– How we can group Matches by segment Triangulation or shared Match Clusters

– How we can use groups to solve brick walls, NPEs, Bio-Ancestors, unknowns

– Which ancestors always, or sometimes, or never have shared Matches

– Why all of our shared segments (6cM and up) may be important to us

– How to Walk Ancestors, Clusters, Segments back in our genealogy

– How spreadsheets can help us collect, arrange, analyze, QC, and use data

– How to use new tools: autoClustering, DNA Painter, browsers, ProTools, etc.

You have all been part of this journey of learning – as in fact, we are all learning from each other. I very much value your feedback and suggestions.

As some of you know, I also host a DNA Special Interest Group (SIG), through the Washington DC Family Search Center. It was in person/local until Covid. We are now international via Zoom – 2nd Wednesday of each month 7-9pm ET. This is now an Advanced DNA SIG, and members are encouraged to participate and/or present (learn from each other). If you’d like to join, please email me at jim4bartletts@verizon.net

Happy Anniversary – your suggestions/observations/comments are “gifts” to us all.

[99F] Segment-ology: Happy 10th Anniversary by Jim Bartlett 20250507

SPECIAL ANNIVERSARY COMING UP

My first real Segmentology blog post was on 7 May 2015 – so an anniversary is coming up soon. I’m looking to consolidate and re-package the approximately 200 posts in Segmentology. If you would like any new or revised topics included, please feel free to use the comments or email me at jim4bartletts@verizon.net. NOTE: The Table of Contents (Outline in the header bar) has been updated, and all the posts are hyperlinked.

[99E] Segment-ology: Special Anniversary Coming Up by Jim Bartlett 20250422

MITx Class on DNA is Free

MITx offers a wide range of free, on-line, self-paced semester-long courses to anyone in the world. Coming up next week is Introduction to Biology – The Secret of Life. I’ve taken this course (actually twice). It’s taught by Professor Eric Lander – the founding director of the Broad Institute and a principal leader of the Human Genome Project – and a fantastic instructor (his course is fun). This course is targeted at non-biology students. This is not about genealogy, it’s about DNA. Anecdote: I was about halfway through the course, and one night my wife called out: “Jim, what are you doing – it’s 3 AM.” My reply: “I’m in a lab, folding proteins to capture a virus”. If you are into DNA and Segment-ology, this is a great opportunity to get a firm grounding. As a side note, I think MITx is a great undertaking and am a regular donor to that program. Free, world-wide MIT classes…

Here is a link: https://www.edx.org/learn/biology/massachusetts-institute-of-technology-introduction-to-biology-the-secret-of-life

Click on the short YouTube video… Enjoy.

[99D] Segment-ology: MITx Class on DNA is Free by Jim Bartlett 20250128

ProTools Part 25

The Path Is Key

This may be an extension of my “genealogy sacrilege” outlook or rant.

But before I begin, to each their own – you get to choose your objectives.

My two main objectives are to get my genealogy right; and to get the Chromosome Map of segments from my Ancestors at each generation right. My objectives do not include finding all of the descendants of all of my Ancestors. However, I do think that documenting how my DNA Matches interrelate to me and each other is very helpful in achieving my two objectives – and this swells my Tree somewhat. I’m finding: Match paper trail paths (and ThruLines clues) that are impossible, given the DNA evidence; and DNA evidence that has revealed genealogy paths I never would have otherwise found (not just limited to breaking through brick walls).

So, a lot of work to do to document what will be over 10,000 Matches…  Time is precious…

When documenting DNA Matches and their line of descent from our MRCA to them, the “Path Is  Key”. Dotting all of the “i”s and crossing all the “t”s is NOT! The DNA segments do not “know” their hosts’ names (or dates, or places), just that the segments are passed along. We genealogists document what we can about each of these Match ancestor DNA hosts. It helps us keep track – in time and place. But how much effort do we need to put into documenting our Matches’ lines? My opinion is: not much! We need to be sure of the path. We don’t need to know the full names, or pet names, or titles. It’s nice to know the birth/death years, but how much digging should we do to find the complete birth date or place? What do we do when several different descendants insist on different given names … I could go on and on, but I’ve decided it’s not my job to adjudicate their family “wars” – my objective is to be clear of the path.

Therefore, I’m now using terms like Pvt, Unknown, GUESS, sibling of XYZ, etc. to describe Match Ancestors – particularly those close to the Match. I don’t really care about their parents’ or grandparents’ names or genealogy info – just the path that must exist for a DNA segment. [NB: proving a specific genealogy-DNA link is a separate issue; a potential path is not a proven path.]

I am still documenting the child and grandchild of the MRCA (given name and birth year at least). But, IMO, the further down the path from the MRCA to the Match, the less precise this info needs to be. The Key Is the Path. I don’t want to introduce incorrect info, so I’m introducing “other” terms in the name field when it is unclear, in debate, or might take days to research and resolve. I note the “path” that has to be and move on. This allows me to get as many DNA Matches as possible into the spreadsheet. Then the interrelationships can be better evaluated.

SUMMARY:  Don’t worry about “fully” documenting the MRCA-to-Match path; just that the path does exist, and no incorrect info is introduced (unless your Tree is private). And, of course, it’s up to your own judgment as to if/how much of this recommendation to follow. My plan is to get as many Matches as possible into MRCA family groups in a spreadsheet, and then study the interrelationships with ProTools. Get Matches in my Tree and my Common Ancestor spreadsheet, but “do no harm”.

[22DG] Segment-ology: ProTools 25 – The Path Is Key by Jim Bartlett 20250222

ProTools Part 24

Small Segment Stats

Ancestry DNA Matches who share 6-7cM and have a known MRCA with me: 1,160.

Total Ancestry DNA Matches at any cM level: 7,450.

About 15% of my DNA Matches with a known MRCA share only 6-7cM.

This is NOT a statement linking DNA and Ancestors.

This IS a statement about the many true cousins we will not see in our Match lists because the current threshold at AncestryDNA is 8cM.

I’m glad I Dotted and saved some of my 6-7cM Matches when Ancestry made the threshold change – it was a fraction of the total. I wish I’d have saved them all…

To end on a higher note – I still have 2,600 other 6-7cM Matches to work with – many of them are being determined as close cousins to known MRCA Matches by using ProTools.

[22DF] Segment-ology: ProTools Part 24 – Small Segment Stats by Jim Bartlett 20250221

ProTools Part 23

Integrating With Genealogy

ProTools is a powerful tool. But it has its limits. 1C and closer relationships are very accurate, in my experience. Beyond that, the range of possibilities grows quickly as the cMs fall below the 1C range. But think about what that means… A 1C relationship takes us back to our grandparent level. Think of a 20 year old genealogist with a 50 year old parent, and 80 year old grandparents. Those grandparents would be in the 1950 census. And the census is a pretty good tool back to 1850 – another few generations. You might argue that the census is not rock solid in every case. There may be adoptions, NPEs, etc. That is true, but those individuals will not show up as DNA Matches – for the most part.

Yes, there are still a few situations that may slip through. But on the plus side, the census and ProTools will sort out a high percentage of false relationships, and/or incorrect genealogy “research”.

Used together, the census and ProTools can pretty accurately cover the past 175 years.

[22DE] Segment-ology: ProTools 23 – Integrating With Genealogy by Jim Bartlett 20250131

ProTools Part 22

A Rant about Relationships

I praise Ancestry for ProTools – just about everything about it is great. I have often reported how accurate the close Relationship Estimates are. I rely almost 100% on 1C and closer relationships; and have found many 2C relationships to be correct. I worked for several days on a 3C relationship – knowing the Trees of the two Matches pretty well – to no avail. This is becoming a regular occurrence.

I’ve noted over the past year, Ancestry has tightened up their Relationship Estimates – all are now within 4C. We can tag a Match at 4C or closer, or Distant. A far cry from the Circles where Ancestry showed us how we were related out to 8C; or even the current ThruLines out to 6C.  Will they change again, tomorrow, to only showing Matches related within 4C or closer? I am long since past that threshold…

So I decided to take a deeper dive, under their hood, to see what they predicted for small cM Matches. I randomly selected a 6cM Match that I had saved. She was predicted to be Half 3C1R or 4C – evidently their deepest estimate. So I clicked on that estimate to get their more in depth analysis. Here are two screenshots of their analysis [sarcasm: based on results from their 27 million testers?]:

It seems to me they have adopted the “Cinderella Principle” – push hard to fit the data into a desired result. Are they really claiming that 99% of all Matches at the 6cM level are a 4C or closer? The Ancestry folks are much smarter than that…  They know better, and, for some reason, AncestryDNA is distorting the truth! SHAME! Our tens of thousands of small cM Matches do not fit into a size 4C Cinderella slipper!!

Bottom lines: still rely on 1C or closer relationships for analysis with ProTools; IMO, beyond 2C, treat the estimates as garbage; let me/us know if you have some insight that I’m missing (other than something related to greed).

[22DD] Segment-ology: ProTools 22 – A Rant About Relationships by Jim Bartlett 20250119

Pro Tools Part 21

Adding a GUESS

Setup

gk (Match1) is known 5C1R – with grandmother: Anetta b 1926 m SURNAME1 > father: private Male > gk; AND gk has 10 known 2C to Anetta’s father (in the line going back to our MRCA).

Justin (Match2) shares 898cM (estimated 1C) to gk; and has a very small Tree of Private Ancestors.

Analysis

To be a 1C to gk, Justin would need to share grandparents with gk – either gk’s paternal grandparents or gk’s maternal grandparents. From the setup (above), we know the maternal grandparents are SURNAME1 and Anetta b 1926; we don’t know (but can often find) gk’s paternal grandparents. In this case there wasn’t enough info in Justin’s Tree to help.

However, there is another way to determine which set of grandparents Justin descends from. If he descends from Anetta’s side, Justin would also be 2C to the 10 known 2C that gk has (NB: all 2C match each other). If Justin descends from the other grandparents of gk, it is highly likely that Justin will NOT share any of the 10 known 2C to gk.  A quick look at Justin’s Shared Match list, shows he matches ALL of the same 2C that gk has. Justin is clearly a 1C to gk on gk’s maternal side – which is the side back to the MRCA with me!

Therefore, I am very confident in adding Justin to my Tree with UNKNOWN parent and KNOWN grandparents: SURNAME1 and Anetta b 1926. The rest of the path gk has back to our MRCA is already in my Tree.
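
The “which side?” test above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not anything Pro Tools does for us. The function name, the placeholder Match names, and the 0.8 cutoff for a “clear” result are my assumptions; in practice this is a judgment call made by eye in the Shared Match list.

```python
def likely_side(candidate_shared_matches, known_side_cousins, strong=0.8):
    """Guess which grandparent couple a predicted 1C descends from.

    candidate_shared_matches: names on the candidate's Shared Match list
    known_side_cousins: known 2C on the side that leads back to the MRCA
    strong: assumed cutoff (fraction of known cousins seen) for a confident call
    """
    known = set(known_side_cousins)
    if not known:
        return "unclear"
    seen = known & set(candidate_shared_matches)
    fraction = len(seen) / len(known)
    if fraction >= strong:
        return "same side"
    if fraction == 0:
        return "other side"
    return "unclear"


# Hypothetical names standing in for gk's 10 known 2C on Anetta's side.
known_2c = [f"2C-{i}" for i in range(1, 11)]

# Justin's Shared Match list contains all 10, plus others.
justin_shared = known_2c + ["someone else", "another Match"]

print(likely_side(justin_shared, known_2c))       # same side
print(likely_side(["someone else"], known_2c))    # other side
```

Anything in between (a few of the known 2C, but not most) comes back as “unclear”, which is where endogamy, pedigree collapse, or a wrong estimate needs to be considered.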

This places another Match into my Common Ancestor spreadsheet and into my Tree. It takes this Match off the list of unknown (aka Mystery) Matches. In Shared Match lists, Justin will now show up as a known (Dotted) Match – reinforcing Clusters. I don’t know if Justin’s addition to my Tree will help AncestryDNA with future ThruLines evaluations, but I hope so. I *know* it will help me.

A similar analysis can be made for a Pro Tools estimate of 1C1R or a 2C, but it gets less reliable with each additional degree of separation. There is also a higher degree of difficulty in the analysis, because the certainty of the cousinship estimate is not as assured and the number of possible alternatives that need to be addressed increases. It’s often not impossible, but it is harder. A strong factor is whether a *candidate* Match shares a lot of the same Shared Matches. In other words, if the candidate Match clusters with a lot of the same Shared Matches (which can be observed in the Shared Match list), to me that is a strong indication that candidate Match has the same MRCA. This needs to be tempered with endogamy or pedigree collapse – judgment is needed in those cases.

[22DC] Segment-ology: Pro Tools Part 21 – Adding a GUESS by Jim Bartlett 20250109

Pro Tools Part 20a

A Plan and some TIPS (corrected)

At the end of 2024, I wanted to review my Plan for using Pro Tools (and filling in a Common Ancestor Spreadsheet) and highlight some TIPS.

For the long haul – addressing all of your genealogy using Pro Tools – make a Plan! Perhaps a New Year’s Resolution…

I now think the best plan is to start with the closest Ancestors and work back a generation at a time.

That is, start with your grandparents – two grandparent couples [Ahnentafels 4 and 6]. The Matches at this level would nominally be 1C to you – maybe some “removed” – like a 1C1R or 1C2R – particularly as we get older:>( There are only two groups at this generation – one on the paternal side and one on the maternal side. So, two CA-Couple headers in the CA Spreadsheet. For each row under a header row, enter the Match information (name, cM, # segments, cousinship) and then the child of the CA and their birth year, and then the path to the Match.

TIP1: for each, and every, Match I list, I use Pro Tools to show *their* closest Matches – these are often close Shared Matches to them that can be figured out even if the SM has no Tree.

TIP2: for each Match I list, I add them (and their path to the CA) into my Tree (and apply the DNA-connection and/or DNA-Match Tags). I don’t know if the Tags help AncestryDNA build Trees or determine ThruLines; but it does help me when I run across them days/weeks/months/years later. Not necessarily a *certification*, but at least a reminder that I’ve reviewed the path before.

TIP3: Fill in some Notes for the Match – I always start with my CA code – example: #A0064P [the A means I’m satisfied the Ancestor is correct; the # is a holdover from the days we searched for unique strings; the 0064P is Ahnentafel 64 on my Paternal side]. [In a DNAGedcom Client Spreadsheet Report, I can sort on the Notes column, and they will group in order.]

TIP4: I Star & MRCA Dot & Tagged-in-my-Spreadsheet Dot each Match – this unique Star-Dot-Dot “trio” clearly highlights Shared Matches who are already in my CA Spreadsheet. In a Shared Match list they help identify a Cluster.

TIP5: Each of the Matches under an MRCA Couple at this generation should match each other. They are 1C, 1C1R, 1C2R to you and each other, and all should Match. A Quality Control Check. [NB: I am tempted to add in any Aunt or Uncle Matches to my Spreadsheet; but they may be close to the Match, but not on the path to my Ancestor – when that happens they won’t have close cM ties to the other Matches.]

TIP5: I have a separate column in my CA Spreadsheet to indicate I’ve done all of the above. I’ve got about 8,000 Match rows in my spreadsheet and I’m reviewing each one to make sure I’ve covered all of the above and then check it off. As it turns out, some have changed their Trees, some have dropped out of Ancestry, Ancestry continues to update ThruLines, etc., etc. This checkoff indicates a fresh update.
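
A note on the CA code in TIP3: the leading zeros are what make a plain text sort on the Notes column come out in Ahnentafel order. Here is a minimal Python sketch of that idea. The function name is hypothetical, and the four-digit width is simply read off the #A0064P example.

```python
def ca_code(ahnentafel, side, status="A"):
    """Build a Notes prefix like '#A0064P'.

    ahnentafel: Ancestor number, zero-padded to four digits
    side: 'P' for Paternal, as in the example code
    status: 'A' means satisfied the Ancestor is correct
    """
    return f"#{status}{ahnentafel:04d}{side}"


padded = [ca_code(n, "P") for n in (128, 16, 64)]
unpadded = ["#A128P", "#A16P", "#A64P"]

print(sorted(padded))    # ['#A0016P', '#A0064P', '#A0128P']
print(sorted(unpadded))  # ['#A128P', '#A16P', '#A64P']
```

Without the padding, a text sort puts Ahnentafel 128 ahead of 16, and the groups would not line up generation by generation.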

Time now to tackle the four Great grandparent couples [Ahnentafels 8, 10, 12, & 14].

Repeat the steps above for each of your Ancestor couples.

Note that TIP5 still applies – under each couple the Matches are 2C (or 2C1R, 2C2R, or maybe 1C1R) with you. These nominal 2C should all be close cousins to each other (sharing large amounts of cMs).

At any point in this process, take a break and chase down a rabbit hole or two. But then come back to this methodical process.

TIP6: Using this process, makes us treat all of our Ancestors equally. I tend toward favorite Ancestor lines, and this process forces me to grind through all of the Ancestors and Matches.  It’s a good thing.

A slight change occurs at the next generation [eight 2xG grandparent couples; 3C level; A 16 – A 30]. At this level, TIP5 breaks down a little.

TIP7: Reminder – 2C-100%; 3C-90%; 4C-50%; 5C-15%; 6C-5% – (roughly)… This is the “curve” indicating how often true cousins will be a DNA Match to each other. ALL true 2C will be a DNA Match to each other. Of 10 true 3C, each one will usually have a DNA match with only about 9 of the others; but each of the 10 will have about 9 of the others matching – so these 10 would still form a pretty strong Cluster… Among a group of 4C, each one will only match about half of the total; and they may not all form one, strong/compact Cluster. And it gets worse, at the 5C and 6C levels… – some interconnecting cousin Matches, but not strong Clusters. However, now with Pro Tools we can find groups of strongly interconnected (closely related) Matches – strong ties to each other, but perhaps their strong subgroup is 5C to 8C with you.

At the 4C level, I see interconnected groups around the children of each grandparent couple; and sometimes a few interconnections between children. At the 5C level, as expected, I’m seeing groups (Clusters) form on the grandchildren.
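
The rough “curve” in TIP7 can be turned into a quick back-of-the-envelope calculation. The sketch below uses only the percentages listed in TIP7; the function name and the group size of 10 are assumptions for illustration, and the result is an average, not a prediction for any one family.

```python
# Rough chance that two true cousins of a given degree share DNA (from TIP7).
MATCH_RATE = {"2C": 1.00, "3C": 0.90, "4C": 0.50, "5C": 0.15, "6C": 0.05}


def expected_matches(group_size, degree):
    """Average number of the OTHER cousins in the group that one cousin will match."""
    return (group_size - 1) * MATCH_RATE[degree]


# In a group of 10 true cousins, each one has 9 others to match (or not).
for degree in MATCH_RATE:
    print(degree, round(expected_matches(10, degree), 1), "of 9")
```

For 10 true 3C, each one matches most of the other 9 on average, so the Cluster still looks solid; for 4C it is about half, and by 5C and 6C only one or two links per person remain.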

Additional TIPS

TIP8: multiple marriages; non-marriages: IF you and a Match only share DNA through one Ancestor, then your relationship is “Half”. Pro Tools often includes cMs for Half relationships, but these only apply when you share only one Common Ancestor.

TIP9: Some Matches may be related to you multiple ways – give them a separate row (and Ahnentafel #) for each relationship. NB: If you are 3C on A16 and on A18 the odds are equal – with one segment, it could be either; with multiple segments, it could be both… However, if you are 3C on A16 and 4C on A38, with one segment, the odds are 4 to 1 that the DNA came from A16; and if you are 3C on A16 and 5C on A76, the odds are 16 to 1 that the DNA came from A16. This is because *shared DNA* is divided by 4 with each generation, on average. If you have shared DNA with a Match, it’s much more likely to be from the closest relationship.  
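
The odds in TIP9 follow directly from the Ahnentafel numbers. Here is a small Python sketch, with hypothetical function names, that uses the standard Ahnentafel layout (4-7 are grandparents, 8-15 great grandparents, and so on) and the “divided by 4 with each generation” average stated above.

```python
def cousin_degree(ahnentafel):
    """Nominal cousinship for a Match descending from this Ancestor.

    Ahnentafel 4-7 (grandparents) -> 1C, 8-15 -> 2C, 16-31 -> 3C, etc.
    bit_length() - 1 is the number of generations back from you.
    """
    return ahnentafel.bit_length() - 2


def odds_for_closer(ahn_a, ahn_b):
    """Rough odds (N to 1) that a single shared segment came through the closer line,
    assuming shared DNA drops by a factor of 4 per generation, on average."""
    gap = abs(cousin_degree(ahn_a) - cousin_degree(ahn_b))
    return 4 ** gap


print(cousin_degree(16), cousin_degree(38), cousin_degree(76))  # 3 4 5
print(odds_for_closer(16, 18))  # 1  (equal odds)
print(odds_for_closer(16, 38))  # 4  (4 to 1 for A16)
print(odds_for_closer(16, 76))  # 16 (16 to 1 for A16)
```

These are averages only; with multiple shared segments, more than one line may be contributing.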

Please post in the comments if you have good TIPs that would help us all.

Happy New Year!

[I fixed the error in Tip 7, and reposted]

[22DB] Segment-ology: Pro Tools Part 20a – A Plan and some TIPS by Jim Bartlett 20250101

Pro Tools Part 19

Comments on Sacrilegious Genetic Genealogy

I thought these comments were excellent and wanted to share them.

Guest Post from Terry Butcher dated 11 Dec 2024

In regards to your Pro Tools Part 16 Sacrilegious Genetic Genealogy post, I would like to share some thoughts on the topic.

While I appreciate the power that various DNA analysis techniques offer in identifying clusters of matches to specific common ancestors, my primary focus has always been about the genealogy side of the effort.

I feel that I need to connect my tree to each match to really have anything of value. I already accept that I am related to my matches (within the parameters you have described related to cM size). Being able to document the relationship and share it with my matches is my reward for investing the time and effort in researching them.

I try to make a connection with each match and approach each one as an opportunity to learn something new. Each match that I find a common ancestor for in essence validates that specific branch of my tree by having both a paper trail and a DNA match.

I add my match’s tree into my tree as I research them. I start by adding them as an unrelated person in my tree and start working back along their tree picking up all of their branches until I either find a common ancestor, hit a dead end or believe there is no longer any possibility because the location has gone back to Europe. It usually doesn’t take long to find most CAs. While researching a match, I usually only add parents and the child, ignoring the other siblings to save effort. However, if I am successful in finding our CA, I will usually go back and pick up the other siblings for several of the most recent generations.

I have been systematically working my way through my matches starting with closest related and have made it down to the 41 cM matches (about 2,000 so far). If the match has useful information in their tree, I have been successful about 90-95% of the time. In the past, I would contact matches without trees and offer assistance. Now with Shared Matches Pro I am able to find their close matches with trees and sometimes find a CA. This is a much welcomed capability that changes what is possible in my research. I have a total of 132k matches now with 11,500 marked as 4th cousin or closer. It would take me many, many years to even get through the 4th cousins and closer matches so I am not worried about running out of matches to research that I have an excellent chance of finding a CA.

For the 5-10% of my matches that I build their tree but can not find a CA, I suspect they may be either connected with 2 brick walls that I have at 3rd GGF or some unknown adoption or incorrect parent in my tree. Several of these unsolved CA matches now tie together in their trees and I am hopeful they will eventually result in solutions.

By working through my matches and incorporating their trees into my tree, I have expanded my tree significantly to over 222k people now.  As nearly all of my ancestors have lived in WV since the early 1800’s, my tree is heavily weighted with WV families. I typically don’t have to add but a generation or two until I find my CA.

I am not concerned about having floating tree branches as I believe they will eventually connect into my overall tree.  Anytime I encounter a common surname in my research, I chase it back until it connects with other members of that family which strengthens the connections in my tree.

I value the ability to generate family tree reports showing the relationship path between my match and myself and always share the typically one-page report with my match by saving it to my Dropbox folder and sharing a link in the message I send them.

Any match that I can connect to my tree to a CA has over 10k ancestors (and their descendants) with many up to 40k.   

My approach over my 30 years of genealogy as a hobby has evolved as it has for most I suspect.  As I research, I pick up as much information as I can including photos, obituaries and sometimes other records like draft registration documents, marriage and death certificates. All of these documents are incorporated into the detailed reports I generate whenever the person is included in the report which makes for some very interesting reading for my matches when I share reports with them. I find that Ancestry provides 98% of my information with a bit of help from the other sites whenever I hit a dead end in Ancestry.

[22DA] Segment-ology: Pro Tools Part 19 – Comments on Sacrilegious Genetic Genealogy by Terry Butcher 20241211

Pro Tools Part 18

Family Group Sheets

One of the key features of my Common Ancestor Spreadsheet (see post here) is that it offers an arrangement like a traditional genealogy Family Group Sheet (FGS). The FGS has an Ancestor couple at the top of the sheet, with a list of their children down the page with birth, death, marriage dates and places. If we are going to create an inventory of our DNA Matches with known links to an MRCA, this FGS spreadsheet format would be a great way to do that. It also turns out to be a handy tool when working with Pro Tools.

The Common Ancestor spreadsheet for Match cousins is actually a “nested” FGS. By sorting on Ancestor Ahnentafel Numbers, all the Matches connected to one Ancestor are grouped together. By also sorting on the birth year of the Ancestor’s children, this “FGS sort” results with Matches grouped under each child. By adding sorts on birth years for grandchildren and great grandchildren, we get a “nested” FGS. I regularly use my entire spreadsheet sorted by these four columns.
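
For readers who keep their data outside a spreadsheet program, the four-column “nested FGS” sort can be sketched in Python as below. The column names and the sample rows are hypothetical; only the sort order (Ahnentafel, then birth years of child, grandchild, great grandchild) comes from the description above.

```python
# Hypothetical rows; a real spreadsheet has many more columns (cM, segments, notes...).
rows = [
    {"match": "Match D", "ahnentafel": 38, "child_born": 1815, "grandchild_born": 1842, "ggchild_born": 1870},
    {"match": "Match A", "ahnentafel": 16, "child_born": 1850, "grandchild_born": 1875, "ggchild_born": 1901},
    {"match": "Match C", "ahnentafel": 38, "child_born": 1808, "grandchild_born": 1835, "ggchild_born": 1860},
    {"match": "Match B", "ahnentafel": 38, "child_born": 1808, "grandchild_born": 1830, "ggchild_born": 1858},
]


def fgs_key(row):
    """Sort key: Ancestor first, then each generation's birth year down the path."""
    return (
        row["ahnentafel"],
        row["child_born"],
        row["grandchild_born"],
        row["ggchild_born"],
    )


fgs_sorted = sorted(rows, key=fgs_key)

for row in fgs_sorted:
    print(row["ahnentafel"], row["child_born"], row["grandchild_born"], row["match"])
```

With this ordering, Match B and Match C land together under the child born in 1808, and Match D falls under the next child of the same Ancestor, just as on a paper Family Group Sheet.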

This arrangement has several advantages when using Pro Tools…

1. When Pro Tools indicates a parent/child or sibling relationship to an existing Match (already entered into the spreadsheet), I can create a new row and copy most of the info and just adjust one column – a real time saver. And this works even with new Matches with No Tree, Private Tree, Unlinked Tree, Scrawny Tree, even small cMs – Pro Tools has already provided all the relationship information needed.

2. When Pro Tools indicates a (full) 1C relationship to an existing Match, this limits the relationship possibilities to only two. [In my experience, 1C estimates are highly accurate.] Analysis: the new Match is connected to the existing Match (already in the spreadsheet) on (1) the same side I am on, or (2) on the other side. Be aware of this! If the new Match is on the “other” side, they are NOT part of this Ancestor (Ahnentafel) line. If the new Match has any info in a Tree, this “side” issue can usually be figured out and the spreadsheet cells filled out (mostly by copying from the existing Match). If there is no Tree info, the “side” can usually be determined by looking at the Shared Matches of the new Match (sorted on new Match’s cMs). There should be a clear consensus (at/near the top of that list) of the same Ancestor line as the existing Match.  If not, then skip this new Match. If so, I add a row for the new Match, copy data from the existing Match, and enter GUESS for the new Match parent (as a sibling of the existing Match parent), and then the new Match [NB: to save typing, I indicate each “terminal” Match as an asterisk (*) because they are already spelled out in the Match column near the beginning of the row.]

Analysis summary: A) look at their Tree; and/or B) look at their closest SMOMs.

3. For a 1C1R or 1C2R the estimates are still very good, and the process above can be used. Use available info or judgement to shift the new Match to the right or left per the “removes”. Where the individuals are not known, just put Unknown or Private in the cell. The complete path down to the Match is not critical, IMO.

4. When Pro Tools indicates Aunt/Uncle or Niece/Nephew, that too is highly accurate, as are the genders. Similar to the above, there is usually enough information to place them in the spreadsheet (which is like a horizontal Tree).

5. Pro Tools often includes a Half relationship in their estimate. This is based on tables that indicate the two estimates shown have almost exactly the same cM range. Although technically correct, it is much more likely, IMO, that the relationships are standard (NOT Half). But a few will be Half so watch for that situation. Remember these Pro Tools cMs are between your Match and the Shared Match (not affected by whether or not you have a Half relationship with the Ancestor).

6. Adding a hitherto unknown child branch – best described by a recent example I had. In looking for my A38 (ALLEN ancestor) cousins, I found a bunch descending from four well documented children of A38 – 56 Match cousins (4C, 4C1R, 4C2R and 4C3R) with an average of 18cM. There appear to be more than four children in the 1810-30 Virginia census records. And there was an old story about this family, that a son named William went west. So when some known Matches had some SMOMs with ancestor William H ALLEN born 1815 in VA and living in IL, I took notice – it seemed to fit. As I pushed it with Pro Tools I found (so far) 10 Matches descending from William H ALLEN averaging 20cM. But more importantly, those Matches also had Shared Matches with 12 of the 56 Matches from other children from this A38. It sure looked like a Cluster with gray cells to other Clusters! I’d really like to determine William’s Y-DNA; and/or some DNA segment data… But, in the meantime, I’ve got two of William’s descendants checking their Matches for links to my A38 ALLEN. There are 147 Trees at Ancestry for William H ALLEN – not a one has any good clue to his ancestry, except that he was born in VA. Not my Brick Wall, but I think there will be 147 happy campers.

A key point in this long story, is the DNA has no sense of geography. The facts that four children stayed in VA (and were well known) and one child moved far away, made no difference to the DNA. From each descendant’s viewpoint, all the lines were equal – and a pretty even distribution of Matches showed up for all 5 children. The DNA is like blind justice.

7. Equality – a final thought is that this spreadsheet is a lot like the DNA – it’s relatively equal over all the Ancestors and descendants. This spreadsheet encourages me to treat all of my Ancestors equally (they each have an Ahnentafel placeholder row). I still have my “favorite” Ancestors, but as I methodically go through the spreadsheet, I’m spending time on each one. This includes the Ancestors that have issues… This spreadsheet also highlights the Brick Wall holes, to be plugged with floating family branches. This is a good thing.

To me, the key points in doing this spreadsheet work also include:

1. An inventory of Matches who have MRCAs with me. Separate from my on-line Tree. Saved in the cloud and/or archived – available to my heirs or selected genealogy archives someday.

2. Family Group Sheets – of sorts* – this is a standard genealogy tool.

3. A Quality Control check on the accuracy of name spelling and birth years; and the FGS itself. This QC review often reveals “quirks” (as a kinder word) that folks have in their Trees…

4. With Ancestor second marriages, this FGS listing will show the demarcation between full cousins and Half cousins. [I add “INSIGHT” rows with marriage years that will sort and separate the children to the different parent couples.] Half cousins for me only occur at the children level in my spreadsheet. Half cousins between Matches and Shared Matches can occur anywhere.

5. A re-sort by Match name highlights multiple relationships. Since shared DNA is divided by 4 (on average) going back each generation, the closer relationships are much more likely. I’ve found some Matches with MRCAs on both sides of my Tree. With single shared segments, the DNA can only come from one Ancestor. With multiple shared segments, there may be a segment for each line.
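The re-sort by Match name in point 5 can be sketched in a few lines of Python. This is only a minimal sketch, assuming each spreadsheet row is reduced to a (Match name, Ahnentafel of the Common Ancestor) pair; the row layout, the sample names and the function name `find_multiple_relationships` are hypothetical and not taken from the actual spreadsheet.

```python
from collections import defaultdict


def find_multiple_relationships(rows):
    """Group rows by Match name and keep Matches listed under 2+ Ancestors.

    rows: iterable of (match_name, ahnentafel) pairs, a simplified and
    hypothetical view of one spreadsheet row per line of descent.
    """
    by_match = defaultdict(set)
    for match_name, ahnentafel in rows:
        by_match[match_name].add(ahnentafel)
    # A Match with two or more distinct Ancestors has multiple relationships.
    return {
        name: sorted(ancestors)
        for name, ancestors in sorted(by_match.items())
        if len(ancestors) > 1
    }


# Made-up sample rows, for illustration only.
sample_rows = [
    ("Match A", 38),
    ("Match B", 38),
    ("Match A", 74),
    ("Match C", 36),
]
multiple = find_multiple_relationships(sample_rows)
print(multiple)
```

In a spreadsheet the same effect comes from sorting on the Match name column and scanning for repeated names; the sketch just does the scanning automatically.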

* I used “of sorts” in 2 above, because this FGS will not usually be a complete list of all Ancestor children, grandchildren, etc. It includes only the ones who provided a DNA path down to our Matches. Which in turn depends on family sizes and who did DNA tests – there can be wide variations on both.

Note: If I were starting over, I’d probably add name & birth year columns for 9 generations – out to 8C level; and then a catch-all column for any additional info. This would provide a handy way to evaluate the cousinship levels. Reminder: I only list the given name and one initial for males; and the given name, initial and married surname for females. I try to keep it as easy and simple as possible.

Bottom line: An FGS spreadsheet offers an easy way to add new Matches which have been identified by Pro Tools as closely related to known Matches. This adds independent, genealogy triangulation and tight Clusters to an inventory of known Matches. It will be an outstanding adjunct to an auto-Clustering program.

Also – you don’t have to use a spreadsheet to benefit from most of the concepts embedded above.

[22CZ] Segment-ology: Pro Tools Part 18 – Family Group Sheets by Jim Bartlett 20241209

Pro Tools Part 17

NPEs

If we just consider our own ancestral line, we may miss some NPEs. We may have an NPE as an Ancestor, IF we haven’t explored the whole family.

Way back, NPE was Non-Paternal Event, but we’ve seen non-Maternal events, too. So we changed it to Not the Parent Expected. The whole issue centers around the expectation of a family with two “expected” parents. Important: an NPE is usually for one child – perhaps your Ancestor; perhaps a different child in the family. We “expect” all the children in a family to be from the husband and wife.  So “usually” an NPE is a one-off event. But life unfolds in many different ways…

A man and a woman create a child – sometimes one of them is not married (i.e. living with their parents, or on their own) – or perhaps this is the case for both of them. Sometimes they are both married to someone else. Sometimes the man is not (or is never) aware the woman got pregnant. Again – in life, there are many variations to this. The point is the NPE does not apply to a family – it applies to a child. This is important to DNA analysis, and how we use Pro Tools.

I have this case for one of my Ancestors. The pregnant woman was an unmarried child in a family who raised her and her son, giving him their surname (which has confused genealogists to this day). It appears the father was not yet married either, but he went on to marry and have children. I know because I got some DNA from him (through the NPE child) and have Matches who descend from him through his other children (half cousins), and through her children by her later marriage (half cousins). [NB: Challenging in my Common Ancestor spreadsheet.]

Getting back to Pro Tools – the DNA truth-teller/helper. In general, the higher-cM SMOM interrelationships lead to one generational level in my Tree – to one MRCA couple. They may be cousins 1 or 2 or 3 times removed (because I’m old), but usually all go back to one MRCA. Then, as I scroll down the SMOM list, I often find SMOMs who descend from one generation further back. This is normal and expected. These would be a generation more distant to us, and should have appropriately smaller cMs, on average. In fact, if this doesn’t happen, we should be suspicious.

NB: Alternatively, some highest-cM Matches may be tied to a closer generation (which should be, on average, a higher-cM relationship). If these higher-cM Matches are at the same generation level, it may be due to multiple segments and, perhaps, additional relationships (with Colonial Virginia ancestry, I sometimes find multiple relationships with some Matches).

Finally, back to NPEs… If one of the Ancestors in an MRCA couple is an NPE, you wouldn’t get any Matches to that couple (just like with an only child; an exception would be if they had more than one child together).  So, instead, look to see that *some* of the Matches are from each bio-parent.  This is how I solved a Brick Wall. I had many Matches to my A36 (4C level) Ancestors [Thomas NEWLON & unknown wife]. As I kept looking at the Shared Matches, I found some smaller-cM Matches to my A72 (5C level) couple [Thomas NEWLON’s parents] who had been well researched. Analysis of “other” Shared Matches revealed many had the CUMMINGS surname (now my A74; 5C level ancestor).

The point is that if Pro Tools points to a group of higher-cM Matches to a 3C, 4C, etc MRCA, the lower-cM Matches should point to groups for the next two MRCAs back. This is true whether these MRCAs are well known or an NPE or a Brick Wall. If you find a consensus Ancestor among these smaller-cM Matches you may have found GOLD.

Bottom Line: When dealing with an NPE, think carefully about what that means to Pro Tools, and target your “rabbit holes” appropriately;>j

[22CY] Segment-ology: Pro Tools Part 17 – NPEs by Jim Bartlett 20241208

Pro Tools Part 16

Sacrilegious Genetic Genealogy

For this post I want to explore a deviation from the normal genealogy and DNA research “requirements”.

Do we need to do comprehensive research on each cousin Match? Do I really need to find the complete link between each Match and our Common Ancestor? The sacrilege: do I care about all my distant cousins – to the extent that I must develop their complete link to me? Do I really care how much DNA they share with me? Must I link the DNA to the Common Ancestor? Or, is it enough to determine that they are on a specific branch of my Tree? I think so!

My standard mantra: our bio-Ancestors and DNA segments are set! We compare each Match to our Tree and DNA to find a Common Ancestor. I’m very close to finding out how 10% of my 100,000 Matches (at Ancestry) are related to my bio-Ancestors.

My experience with Pro Tools indicates many more can be easily found. I acknowledge that some shared DNA segments under 15cM will be false – but that doesn’t mean those Matches aren’t related to me.  Most of our true cousins beyond 3C will not share any DNA with us, so is the cM amount beyond 3C meaningful?  I acknowledge that some Matches will be related beyond a genealogy timeframe.

However, given these negative factors, I believe a lot more of my Matches are related to me within 9 generations back [8C level] – perhaps somewhat more than 20% of my total Matches. It’s taken me 14 years to “collect” and document approximately 10% of my Matches as cousins.  It’s daunting to think what time and effort I’d need to double that.

My sacrilege is to give up on full genealogy research for each Match. Using Pro Tools I’m finding lots of 6-10cM (small segment) Matches (to me) that are children, nieces/nephews, or 1C to strong higher-cM Matches that I have placed in my Tree. Clearly, these Matches are part of a family group well within a genealogy time frame.

I’m inclined to just quickly:

1. Add these small-segment Matches to my Common Ancestor spreadsheet

2. Add a Match Note (at Ancestry) to indicate the Common Ancestor and/or Ahnentafel [e.g. #A0062]

3. Give them my standard star and MRCA Dot; but not the Dot indicating a linked Match

4. Use a new Dot to indicate “Likely” in a family group under the MRCA; but not complete research [I could always filter on that Dot later, and do the research, some day…]

5. Add a shorthand note like:  SMOM: 3,442cM/son of “Match Name” [SMOM: Shared Matches of Match – the cM between them]
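For anyone who keeps these Notes in a script or spreadsheet export, the two shorthand formats in steps 2 and 5 can be generated consistently. The following is a minimal sketch, assuming the Ahnentafel tag is always zero-padded to four digits as in the #A0062 example; the function names are hypothetical, and the code uses straight quotes around the Match name.

```python
def ahnentafel_tag(number, width=4):
    """Return a tag such as '#A0062' for Ahnentafel number 62.

    The four-digit zero padding is an assumption based on the example tag.
    """
    return f"#A{number:0{width}d}"


def smom_note(shared_cm, relationship, match_name):
    """Return shorthand such as: SMOM: 3,442cM/son of "Match Name".

    shared_cm is the cM between the new Match and the known Match.
    """
    return f'SMOM: {shared_cm:,}cM/{relationship} of "{match_name}"'


print(ahnentafel_tag(62))
print(smom_note(3442, "son", "Match Name"))
```

Keeping the tag format fixed is what makes a later filter or search on the Note text reliable.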

I’m looking for a more efficient way to group Matches into known family lines.

There are several points here:

1. Identify additional Matches within a genealogy timeframe (is it over 50% of all Matches?)

2. Group Matches under my Ancestor Couples – often under a specific child or grandchild (why would I need to dig deeper – unless the Match had a robust Tree with many records…)

3. Build a firm interrelated framework for later research on each extended “twig” of my Tree. Get some confidence of my Ancestors and their children and grandchildren.

4. Identify Brick Walls through clear absence of interconnected Matches. My spreadsheet has an Ahnentafel header for each of my Ancestors back to the 8C level – some of them have no known Matches, or what is clearly a small mess of non-interconnecting Matches. These are a judgment call, but with many more Matches involved, these few “problems” become more and more obvious.

5. Connect Floating branches – I now have several strong “clumps” of interconnected Matches, under a single MRCA couple, that I cannot link to my Tree. This is a strong hint in light of #4 above. I plan to explore this more in a separate blogpost.  

For DNAGedCom, Genetic Affairs, DNA Painter: Any way to automate the Clusters/Groups to include only those Matches who interrelate, say, over 90cM (and make that threshold adjustable)?
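To make the request concrete, here is a minimal sketch of threshold-based grouping. It is not how DNAGedCom, Genetic Affairs or DNA Painter work; it simply assumes you have a table of cM shared between pairs of Matches (the kind of number Pro Tools shows) and links any two Matches at or above an adjustable threshold. The function name, the data layout and the sample names are all hypothetical.

```python
def cluster_by_threshold(pair_cms, threshold_cm=90):
    """Group Matches linked by shared cM at or above threshold_cm.

    pair_cms: dict mapping (match_a, match_b) -> cM shared between them.
    Returns a list of sorted name lists, one per connected group.
    """
    # Build an adjacency map using only the pairs that meet the threshold.
    neighbors = {}
    for (a, b), cm in pair_cms.items():
        if cm >= threshold_cm:
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)

    clusters, seen = [], set()
    for start in sorted(neighbors):
        if start in seen:
            continue
        # Walk outward from this Match through strong links only.
        group, stack = set(), [start]
        while stack:
            name = stack.pop()
            if name in group:
                continue
            group.add(name)
            stack.extend(neighbors[name] - group)
        seen |= group
        clusters.append(sorted(group))
    return clusters


# Made-up pairwise cMs, for illustration only.
sample_pairs = {
    ("Ann", "Bob"): 1700,
    ("Bob", "Cal"): 120,
    ("Cal", "Dee"): 45,
    ("Dee", "Eve"): 95,
}
print(cluster_by_threshold(sample_pairs))
print(cluster_by_threshold(sample_pairs, threshold_cm=40))
```

With the default 90cM the sample splits into two groups; lowering the threshold to 40cM merges them, which is the adjustability being asked for.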

Bottom line: I think many more, if not most, of our Matches will turn out to be real cousins within a genealogy timeframe (out through 8C level). This includes Matches with no Trees, Private Trees, UnLinked Trees and scrawny Trees – all of these are now put into the mix through Pro Tools. For me, compiling data from my 100,000 Ancestry Matches will be a way to bound (if not counter) the continued warnings that many of our Matches are false and/or distant. Some are, some are not – what can we learn?

As usual, I value your feedback – on the sacrilege of adding Matches to Tree branches based on strong interrelationships, but without fully documenting the genealogy; as well as the bigger picture of possibly linking Floating branches to “bare spots” in our Trees.

[22CX] Segment-ology: Pro Tools Part 16 – Sacrilegious Genetic Genealogy by Jim Bartlett 20241205

Pro Tools Part 15

Shared Match Cluster Hints

I’ve written in this Pro Tools series about the power of Shared Matches. They form manual Clusters of Matches. Like all Clusters, they *tend* to point to a Common Ancestor. Each individual Match has their own ancestry, and they may relate to us in several different ways (particularly with my Colonial Virginia ancestry). With auto-Clustering this is displayed by placing the Match in a Cluster with the strongest ties to other Shared Matches – and using gray-cells to indicate ties to other Clusters. This shows up in a Shared Match list with a mix of Shared Matches tied to one Common Ancestor, along with other Shared Matches who may be related in different ways, and even some Shared Matches who might not be interrelated at all.

So, to make a point: Shared Match Clusters (or concentrations in Shared Match Lists) should be considered as a Hint. The stronger the consensus, the stronger the Hint. The chore that still remains is tracing the genealogy from the Match to a Common Ancestor(s).

I find that consensus is a judgment call. But when I make that call, I usually find other Matches with a genealogy link as expected. But not always…

Segment Triangulation is fairly precise – each of our DNA segments came to us from one particular ancestral path. Shared Matches (aka In Common With, aka Relatives in Common, etc) are not equivalent to Triangulation. When Shared Matches form a Cluster, it’s a strong Hint. And a 20×20 Cluster is much stronger than a 3×3 Cluster. And a 20×20 Cluster where each Match matches almost all of the other Matches is very strong, compared to a 20×20 Cluster where each Match only matches, say, half of the others… I have found large, strong Clusters (beyond close cousins) usually turn out to include one TG (maybe two), but there is no hard rule.
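One way to put a number on "each Match matches almost all of the other Matches" is the share of possible pairs inside the Cluster that actually show up as Shared Matches. This is a minimal sketch of that idea, not a measure used by any auto-Clustering tool; the function name and data layout are hypothetical.

```python
from itertools import combinations


def cluster_density(members, shared_pairs):
    """Fraction of possible Match-to-Match pairs that actually share DNA.

    members: list of Match names in the Cluster.
    shared_pairs: set of frozensets, each holding two names that appear
    on each other's Shared Match lists.
    """
    possible = list(combinations(members, 2))
    if not possible:
        return 0.0
    linked = sum(1 for a, b in possible if frozenset((a, b)) in shared_pairs)
    return linked / len(possible)


# Made-up 4x4 Cluster where one pair (C and D) does not match.
members = ["A", "B", "C", "D"]
shared = {
    frozenset(("A", "B")), frozenset(("A", "C")), frozenset(("A", "D")),
    frozenset(("B", "C")), frozenset(("B", "D")),
}
density = cluster_density(members, shared)
print(round(density, 2))
```

A 20×20 Cluster has 190 possible pairs, so a density near 1.0 there rests on far more evidence than a perfect 3×3 Cluster with only 3 pairs.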

Summary: Shared Matches can be grouped into Clusters. Clusters are not the same as Triangulated Groups (TGs), but they can be good pointers and helpful Hints.

[22CW] Segment-ology: Pro Tools Part 15: Shared Match Cluster Hints by Jim Bartlett 20241125

Pro Tools Part 14

Jigsaw Puzzles

Our genetic genealogy is very much like a jigsaw puzzle. Our Ancestors and our DNA segments are both pieces of a large jigsaw picture (ourselves). Soon after the moment of conception – when sperm meets egg – our DNA segments and crossover points are determined. And, of course, our Ancestors, each with 2 biological parents, are determined. There may be lots we don’t know, but those configurations (DNA and Ancestors) are fixed – waiting for us to discover them. Just like a box of jigsaw puzzle pieces, all the pieces  are there – and they only go together one way (like our DNA segments and our  Ancestors).

Now think about our DNA Matches – perhaps 100,000 of them – as we open our list…   The overarching concept is that a Match sharing at least 15cM with us is always a true (Identical By Descent or IBD) relative; and over half of the remaining Matches will also be IBD and a true relative. Of course, some of these Match-relatives will be distant cousins.

Based on my deep dive with Pro Tools, I’m now convinced at least 20% of my DNA Matches at Ancestry are relatives within a genealogy time-frame. I’ll go out on a limb and say 8C or closer!

So, to the point of this blog post… 20,000 of my 100,000 Matches are probably 8C or closer. Each one of them is a jigsaw puzzle piece. Each one interlocks with me (sometimes in multiple ways) and very often with other Matches (look at *their* Shared Match list). In many cases they form interlocking relationships with each other, from siblings to parent/child to 1C and 2C and 3C interrelationships. Just like a jigsaw puzzle. Some will be like the jigsaw lake, or forest, or barn or road – all of which “clumps” of the puzzle will eventually integrate – only one way – into the grand picture….

With Pro Tools’ new Sort feature (the Shared Matches’ *close to distant* Sort), it’s a whole lot easier to form small branches. Think of it this way…. You have 1,000 Matches, and you can easily find links that result in 500 pairs….  In a flash, you’ve cut your workload in half. And as you form larger clumps of Matches – all of your Matches in that clump must lead back to you! Put another way, look at the clump and see where all of your Matches have a Common Ancestral line – out of the clump and directly into your Tree – somehow…

The jigsaw puzzles:

  1. The Ancestors must interlock in pairs and form an entire “Tree” jigsaw picture
  2. The DNA segments must array adjacently and form a Chromosome Map picture
  3. Our Matches will interlock with us; each other; and our Ancestor Tree.

[22CV] Segment-ology: Pro Tools Part 14: Jigsaw Puzzles by Jim Bartlett 20241124

Pro Tools Part 13

Status of Common Ancestor Spreadsheet

I have a spreadsheet of all Matches with Common Ancestors with me. It includes my Ancestors and their children down to each Match. See more at https://segmentology.org/2021/12/19/segmentology-common-ancestor-spreadsheet/ It’s a lot of work, so feel free to adapt it to suit your needs.

I have been reviewing all of these Matches and adding a LOT more using Pro Tools. I posted various ways to do this here, and I’ve gone down all those rabbit holes. I’m now on a march to review these Matches methodically – from closest Ancestors to more distant. I’ve found that it’s essential to have “known” Matches highlighted in Shared Match lists to speed the process of determining new Matches with CAs and forming family groups. So I’m adopting a two phase process. First: recheck all Matches for firm relationships and give them a clear set of Dots that will spotlight them in a Shared Match List – probably out to 5C level. Then: I’ll go back and use Pro Tools to tease out new Matches to add in.

Toward this end, I’m going to paste a Table below that shows my progress to date; and later I’ll update the Table to show the effect of Pro Tools. I’ve used Ahnentafel numbers (male of an ancestral couple) – their names are not needed for this exercise, although I did use given names for children for the first two generations. The comment column gives some reasons why the cMs deviate from the averages, such as when there are double Cousins or half Cousins, or Ancestors out of the US. You may also note the high number of Matches for Ahnentafel 70 – it’s because I jumped to that Ancestor, and used Pro Tools to find several key Matches to help with a burning question.

Here is where I stand now:

Note that this summary has 2477 Matches, through the 5C level (4XG grandparents). I have another 6,070 Matches in the 6C to 8C group.  My total is 8,547 Matches from AncestryDNA, out of about 100,000 total – I wanted to see what impact Pro Tools will have. We’ll see how far I can get…

[22CU] Segment-ology: Pro Tools Part 13 – Status of Common Ancestor Spreadsheet  by Jim Bartlett 20241117

Pro Tools Part 12

The joke’s on me… heads up!

In my last post I noted that the Pro Tools cM relatedness was pretty accurate! Today I found two Matches who were 1C – their parents were brothers. But the SMOM said 1,637cM, so they had to be half siblings. I checked with DNA Painter – 1,637cM is 100% half siblings (for same generation relationship). Back to the drawing board… Did the two brothers marry (or have children with) the same wife? Maybe one brother died, and the other married the widow… Nope. Checking some more – the two brothers married two sisters! They were double 1C! Not in the DNA Painter range of options, but spot on for twice the 1C cMs. All is OK, but it had me scratching my head for a few minutes.
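The "twice the 1C cMs" check is simple arithmetic. The sketch below assumes the 1C average of 866cM quoted from the Shared cM Project in the Insights into cM Patterns post; doubling an average is only a rough expectation for double first cousins, not a published range, and the names in the code are hypothetical.

```python
# Assumed: Shared cM Project average for a full 1C, as quoted on this blog.
FIRST_COUSIN_AVG_CM = 866


def expected_double_cousin_cm(single_avg_cm=FIRST_COUSIN_AVG_CM):
    """Rough expectation for double cousins: two independent 1C-sized shares."""
    return 2 * single_avg_cm


observed_cm = 1637
expected_cm = expected_double_cousin_cm()
ratio = observed_cm / expected_cm
print(expected_cm, round(ratio, 2))
```

The observed value lands within a few percent of the doubled average, which is why double 1C explains the number as well as half siblings does.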

[22CT] Segment-ology: Pro Tools Part 12 – The Jokes on Me by Jim Bartlett 20241028

Pro Tools Part 11

Ways to analyze Shared Matches Of Matches (SMOM) cMs.

Pro Tools gives us a LOT of new information. Not quite segment Triangulation, but very powerful data.

For example, a Match shares 8cM with me and does not have a Tree. However, a SMOM shares 3,489cM with the Match, and Ancestry (with insider info) says the SMOM is the mother of the Match; the SMOM shares 17cM with me. As it turns out, I know the SMOM is a 3C1R with me on a particular Ancestor couple. It’s easy to 1. add the Match to my Tree; 2. add the Match to my Common Ancestor Spreadsheet; and 3. add a synopsis of this info (as a 3C2R) to the Match’s Notes. Of course this doesn’t happen every time, but it does happen some of the time.
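The step from the mother's relationship (3C1R) to the child's (3C2R) can be written as a small helper. This is a minimal sketch under a stated assumption: the known cousin is in my generation or a younger one, so each step down to their child adds one "removed". If the known cousin were removed in the other direction (an older generation), the child would be a closer cousin and this rule would not apply. Half relationships (e.g. 4C1Rh) are not handled. The function name is hypothetical.

```python
import re


def child_relationship(relationship):
    """Return the relationship of a known cousin's child to me.

    relationship: text like '3C' or '3C1R'.
    Assumes the known cousin descends to my generation or below, so the
    child is the same cousin level, one more time removed.
    """
    parsed = re.fullmatch(r"(\d+)C(?:(\d+)R)?", relationship)
    if parsed is None:
        raise ValueError(f"unsupported relationship: {relationship!r}")
    degree = int(parsed.group(1))
    removed = int(parsed.group(2) or 0)
    return f"{degree}C{removed + 1}R"


print(child_relationship("3C1R"))
print(child_relationship("4C"))
```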

The above example is a parent/child relationship, and Ancestry usually knows if it’s a son or daughter and a mother or father. Ancestry usually knows niece/nephew and aunt/uncle.

But the thrust of this blog post is about a family group and their interrelationships.  I’ve tried several methods to document and analyze new Match/SMOM cMs. All methods utilize my Common Ancestor Spreadsheet which is arranged by family groups [I sort by Ahnentafel of the Common Ancestor; and the birth years of children, grandchildren and great grandchildren.] This CA spreadsheet is my foundation of “known” cousins – I’m looking at their Shared Matches to see if I can determine how we are related and add them to the spreadsheet; and checking to see that the existing cousins are interrelated to each other as expected.
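The family-group ordering described in the brackets above amounts to a multi-level sort. Here is a minimal sketch, assuming each row carries the Ahnentafel of the Common Ancestor and a tuple of birth years for the child, grandchild and great grandchild on the Match's line; the field names and sample rows are hypothetical.

```python
def family_group_sort_key(row):
    """Sort by Common Ancestor Ahnentafel, then birth years down the line."""
    return (row["ahnentafel"], row["birth_years"])


# Made-up rows, for illustration only.
rows = [
    {"match": "M1", "ahnentafel": 38, "birth_years": (1815, 1840, 1870)},
    {"match": "M2", "ahnentafel": 36, "birth_years": (1790, 1822, 1850)},
    {"match": "M3", "ahnentafel": 38, "birth_years": (1805, 1830, 1861)},
    {"match": "M4", "ahnentafel": 38, "birth_years": (1815, 1838, 1866)},
]
ordered = [row["match"] for row in sorted(rows, key=family_group_sort_key)]
print(ordered)
```

Because tuples compare element by element, Matches under the same child (same first birth year) end up next to each other, then split by grandchild, and so on. That adjacency is what makes a family group visible on the sheet.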

First try was to add about 10 blank columns to the spreadsheet. I’d then type an asterisk [*] for a Match in a column, and enter the shared cMs with the other Matches in the spreadsheet in the same column. It was sort of like a Cluster matrix; and anyone who had a faulty genealogy was easily highlighted. But three issues: 1. It was a lot of work for a family group; 2. some of the Matches were in fact related up or down a generation [not physically close on the spreadsheet]; and 3. it was difficult (for me, anyway) to determine how an unknown Match would fit in… [someday I’ll try DNA Painter or BanyanDNA…]

The second try was just one new column, and I would type in the highest cM found among all the Shared Matches; the suggested relationship [almost always accurate for high cMs];  the Match name; and any known info. Issues: again, a lot of work; and some Matches don’t have any high cM SMOM with me.  I still add these when they are the only evidence I have for adding a new Match to my Common Ancestor spreadsheet.

Third/current try involves about 3 new columns and I color in a column where Matches match most of the others. Sort of like LEEDs column-coloring. This is somewhat easier to do, without a lot of typing. And the colored “stripes” are comforting to see (and to highlight Matches who may not “belong” and/or need further research.)
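The column-coloring idea can be mimicked in code: within one family group, mark each Match who matches most of the others, and list the rest for review. This is a minimal sketch, assuming "most" means more than half; that cutoff, the function name and the sample data are all hypothetical.

```python
def stripe_members(members, shared_pairs, min_fraction=0.5):
    """Split a family group into 'striped' Matches and ones needing review.

    members: list of Match names in the family group.
    shared_pairs: set of frozensets of two names that share DNA.
    A Match is striped when it matches more than min_fraction of the others.
    """
    striped, review = [], []
    for name in members:
        others = [m for m in members if m != name]
        hits = sum(1 for other in others if frozenset((name, other)) in shared_pairs)
        if others and hits / len(others) > min_fraction:
            striped.append(name)
        else:
            review.append(name)
    return striped, review


# Made-up group: D matches only one of the other three.
group = ["A", "B", "C", "D"]
pairs = {
    frozenset(("A", "B")), frozenset(("A", "C")),
    frozenset(("B", "C")), frozenset(("C", "D")),
}
striped, review = stripe_members(group, pairs)
print(striped, review)
```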

Also, I’m hopping around some these days, working on specific issues (Brick Walls, questionable genealogy, trying to link in (or out) selected Matches). It appears that the closer generations have one stripy column and as I work on more distant Ancestors, the number of colored columns grows.

I’m still fiddling with good/efficient ways to use/display SMOM cMs; or even if I need to at all. I’ve worked on about 10% of my Matches in the Common Ancestor Spreadsheet. At every turn, Pro Tools is helping me find more and more Matches for whom I can determine our relationship. So still a long way to go – and I’m sure there are many more Matches to add to my spreadsheet.

You are encouraged to post in the comments any insights, tricks or hacks you’ve developed for using SMOM cMs…

[22CS] Segment-ology: Pro Tools 11 – Ways to Analyze SMOM cMs by Jim Bartlett 20241027

Pro Tools Part 10

Branch Groups

I’m methodically working my way through my Ancestors and Matches using Pro Tools. My main tool is my Common Ancestor Spreadsheet, which is now growing very rapidly. I’m not really in it for the bulk, but for the advantages of Branch Groups. What I call Branch Groups are groups of my DNA Matches under one child or grandchild of one of my Ancestors – these Matches are on the same Tree “branch”.  Such Matches are closer to each other (than to me) and tend to share more DNA with each other. They stand out with DNA shares over 90cM; and I take notice. I can often “fit” them into a Branch Group. On the other hand, I’ve found some Matches that have the right genealogy for a Branch Group, but they don’t share much DNA with others in the Group – more on this below.

Here are some thoughts and observations:

SMOM – Shared Matches of Matches aka “Rabbit Holes” – haha.  When you select a Match and click on the Shared Matches button – you get a list of all the Matches you both have in your respective Match lists. These are your Shared Matches (SMs) with that Match. Each of these SMs shares some DNA with you that you both got from the same Common Ancestor (CA). And, with Pro Tools, you know how much DNA each of these SMs also shares with the “base” Match that *they* got from some CA. Often these two CAs are the same (or one is ancestral to the other); but sometimes the CAs are completely different (*their* CA could be unrelated to you or related to you on a different line – see Outliers below). When we’ve done our homework and entered Notes for many Matches, we can usually look down the SM list and easily see if there is a consensus, or not – see Birds of a Feather below. Like with auto-Clustering, a consensus indicates a group of Matches that mostly match each other, indicating a Common Ancestor among them. Usually, their CA is also one of your Ancestors – BINGO! This is a Branch Group.  Sometimes their CA is unknown to you – this could be a random happenstance. Or it could be a Floating Branch Group – see below.

Branch Group aka Cluster. When you find SMOMs who share high levels of shared DNA (cMs) with each other they usually form a Branch Group. By “high levels” I mean at least 90cM; but I often drop down to around 50cM as the group grows larger. I consider 20-25cM as “in the noise”, and usually not worth the trip down a rabbit hole. [For your own situation, experiment to find a threshold that usually gives you efficient results.] Sometimes you can get 5-10 (or more) of these SMOMs which link under a child or grandchild or great grandchild of one of your Ancestors. And then it’s easier to find other SMOMs that fit into the Branch Group. Use an SMOM in a Branch Group to make a new Shared Match list, invariably with new SMOMs… the clues (or rabbit holes) are everywhere! As it turns out in a Branch Group, not all Match descendants will Match all of the other Matches in the group. Remember: at the 4C level, roughly 50% of true 4C won’t show up as matches to each other.
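The working thresholds above can be turned into a quick triage helper. This is a minimal sketch with the thresholds as adjustable parameters; the tier labels are hypothetical, and the treatment of values between about 25cM and 50cM (which the text does not classify) as a judgment call is an assumption.

```python
def smom_tier(shared_cm, strong_cm=90, extended_cm=50, noise_cm=25):
    """Classify the cM shared between two Matches for Branch Group triage.

    Defaults follow the working numbers in the text; tune them for your
    own Matches.
    """
    if shared_cm >= strong_cm:
        return "strong"          # stands out; likely Branch Group member
    if shared_cm >= extended_cm:
        return "extended"        # worth a look once the group is growing
    if shared_cm <= noise_cm:
        return "noise"           # usually not worth the rabbit hole
    return "judgment call"       # assumption: the text leaves this range open


for cm in (120, 60, 35, 22):
    print(cm, smom_tier(cm))
```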

Birds of a Feather.  On many Shared Match lists, a scan of the Notes indicates a clear consensus – most SMs have Notes indicating the same CA; and some are from the same line (up or down a generation). These are birds of a feather – they cluster together. And Pro Tools shows them to be close relatives – these are a Branch Group. In these cases, I’m much more likely to review Matches not yet linked in, and to build their Trees back to find the link. As a quick check, click on a Match and see *their* SMs with you – are they indeed Birds of a Feather? Or not? For some Shared Match lists, a quick scan of existing Notes may indicate they are all over the place – on both sides; on different branches – so, it’s difficult to determine a consensus. Move on…

Outliers – linked by genealogy, but not linked by shared DNA.  I’ve now run into a very few cases of DNA Matches who are clearly genealogy relatives (in my Common Ancestors spreadsheet) under Ancestor XYZ, but they do not share DNA with other close cousin Matches under XYZ. In each case, so far, they are also related to me in another way, and they do share DNA with their other cousins.  Thinking about multiple segments and/or multiple relationships leads me to Triangulated segments, but I’ll put that discussion off for a future blog post. Just be aware that a Match with one shared segment can only be genetically related one way. Pro Tools may help determine which one.

Collateral SURNAMES in Branch Groups. Less than 1% of my Matches have the same SURNAME as the CA we share [Y and mt lines are pretty rare]. This means my Common Ancestor spreadsheet (tracking the lines of descent down to Matches) includes Collateral SURNAMEs. As I’m working on an MRCA Branch Group in my spreadsheet, I’m reviewing each of my Match cousins, and reviewing all of the SMOM shared cMs, and checking the Trees of those over 90cM (and glancing at some down to 50cM). Often there is enough to tie those Matches to my Tree (even some with no Trees). It really helps to review the Collateral SURNAMEs already recorded in my spreadsheet for that Branch – that’s usually where I’m going to find a link. And it means I don’t have to build a tree back for each Match – I can usually copy the line of descent of an existing Match in the spreadsheet, and just change the last few generations. A big time saver – in searching and typing… Recognizing a Collateral SURNAME in a Match’s scrawny Tree is helpful. Sometimes I’ll filter a long Shared Match list by a Collateral SURNAME…

Floating Branch Groups. A few times I’ve found a Branch Group that I cannot link to my Tree. They usually include parent/children, siblings, aunt/uncle/niece/nephew, and maybe some 1C or 2C, all in a tight family group. All the interrelationship cMs are on target. But, other than being on a Shared Match list with some known Matches in a Branch Group, I cannot find a link. In most cases this has happened “near” a Brick Wall (or “iffy”) Ancestor of mine. So I’ve created a Floating Branch in my Tree, so I can link other Matches to it. I need to do a study of closest known Matches to see where this Floating Branch is headed – another rabbit hole. Such a Floating Branch could just be a mirage (not really linked to me), or I might find some “tendril” Matches (maybe through a Collateral SURNAME filter) that help find the link. I operate under the belief that ALL Matches over 15cM (and many under 15cM) are true cousins, and many are within a genealogy timeframe and should fit in my Tree somewhere.   

I am now convinced of two things: A) A lot more of our under-20cM Matches are well within our genealogy timeframe than I originally thought; and B) our Brick Walls (out to at least 8C level) have plenty of Matches forming Branch Groups. With each generation going back, it’s harder and harder to figure them out, but Pro Tools can often provide new insights. This helps offset the fact that many Matches have NO Trees or very scrawny Trees. There is hope! But it takes work!

[22CR] Segment-ology: Pro Tools Part 10 Branch Groups by Jim Bartlett 20240812

Pro Tools Part 9

Build A Foundation

I feel like I’ve been drinking through a fire hose – there are just so many good clues in the Shared Match cM lists. I’ve tried all of the four Plans of Action I previously laid out – and I’ve found myself still jumping from one to another – good clues are just too hard to pass up. And a parent/child, sibling, aunt/uncle/niece/nephew and even a 1C will suck me in like a magnet – particularly when one has NO Tree and another has a good Tree. AND, if I’m working in a small sub-branch so I know many of the collateral SURNAMES and the geography,  I’ve got to capture that info before I move on…

Observation 1: As I scroll through hundreds of Shared Match lists, I see lots of Shared Matches with the same MRCA I’m researching [almost all of my over-20cM Matches have a Note indicating a validated MRCA, or a likely/imputed one]. And I see lots of Shared Matches one generation up or one generation down. For instance, I’m working on my MRCA couple 40P, and I see Shared Matches that are also 40P, and Matches who are 20P and 80P and 82P. I shouldn’t be surprised, because we are all on the same ancestral line; AND a 20P Match who is a 3C (or 3C1R) with me, is also related to my Matches on 40P – maybe as 4C or 3C1R, etc. This is very comforting to see a Match with Shared Matches up and down one of my lines.
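The 20P / 40P / 80P / 82P pattern follows directly from Ahnentafel arithmetic: a person numbered n has father 2n and mother 2n+1. The sketch below assumes, as in the spreadsheet convention used here, that a couple is labeled by the even (male) number; the function names are hypothetical and the P/M suffix is left out.

```python
def child_couple(couple):
    """Couple one generation closer to me, labeled by its even number.

    couple: even Ahnentafel number of the male in an ancestral couple.
    """
    if couple < 4 or couple % 2:
        raise ValueError("expected an even couple number of 4 or more")
    child = couple // 2           # the child of persons n and n+1 is n // 2
    return child - (child % 2)    # label that child's couple by the male


def parent_couples(couple):
    """The two couples one generation further back.

    Returns (husband's parents, wife's parents) as even couple numbers.
    """
    if couple < 2 or couple % 2:
        raise ValueError("expected an even couple number of 2 or more")
    return (2 * couple, 2 * (couple + 1))


print(child_couple(40))
print(parent_couples(40))
```

So a Shared Match list for couple 40 can reasonably be expected to show Notes for couple 20 (one generation down) and for couples 80 and 82 (one generation up).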

Observation 2: Each of the MRCAs that I focus on – usually for a few days – has seen a significant increase in the number of Matches that I can verify exactly how we are related. Plus, if they are closely related to a known Match AND have a bunch of Shared Matches with me along this same line, I can add them to my Common Ancestor Spreadsheet anyway, with confidence they are on the same sub-branch.  In any case, I’m winding up with a lot of Matches under each MRCA; and a lot of new Notes for them.

Recommendation/Tip: Combining 1 and 2 above, I now think the best path forward is to build the foundations and then work back in our Ancestry. I have no 1C, so this means starting with my great grandparent MRCA couples and, using Pro Tools, teasing out as many Matches as possible for each one of my 4 MRCA couples [8P, 10P, 12M, 14M] – and adding their info into the Match Notes. Then, as I move to the next generation further back, I will see many of these Notes in the Shared Match lists for Match-cousins back to MRCAs 16P to 30M. In general, the Shared Matches to these MRCAs will “stay in their lane,” and that is a strong indication that they belong to that MRCA group. Remember, some Shared Matches may match you one way and match the base Match another way – those Matches will usually have a shorter, random list of Shared Matches – I skip over those quickly and move on.

Bottom Line: If we start with our closest MRCA couples and “Note” all the Matches we can, we’ve built a strong foundation for when we get to the next generation. This will become more and more valuable as we work out through more distant generations. I think such a foundation will be essential when we get to 4C and beyond.

[22CQ] Segment-ology: Pro Tools Part 9 Build A Foundation; by Jim Bartlett 20240724

Pro Tools Part 8

Group Process

Here is (sorta) my process for working with a Match and their Shared Match list with me.

1. Pick a Match with an MRCA (I don’t have good criteria yet, but I like one with a good Tree; and it’s helpful to know that they have several close cousins in my Common Ancestor spreadsheet).

2. First pass: look through all of their Shared Matches for Notes that indicate they share the same MRCA. [Sometimes I note the shared cM in a new column; sometimes I just use a highlight color in a column – in either case to indicate a group with the Match in #1.]

3. I’ll stop at any Match who is very close – parent/child; sibling; even aunt/uncle/niece/nephew. If not in my spreadsheet, I add them in and add appropriate Notes to their profile to highlight the MRCA and relationship to me – e.g. #A0038P/4C1R: ALLEN/Elizabeth.

4. Then I make another pass through the Shared Match list – opening the Matches who share above 90cM (generally within about 2C to the original Match in #1 above).  From my spreadsheet I know of other SURNAMES the other Matches have in their path back to the MRCA – so I’m looking for those surnames in addition to the MRCA surnames. For example: MOESZINGER led me to 4 other new Matches from my ALLEN MRCA.

5. Repeat #4 (a third pass) looking at above 50cM or so – digging a little more (and by this time, I usually have additional Matches with helpful Notes to play off of).

6. Now, start at the top of the #1 Shared Match list (a fourth pass), and open each Match who does share the MRCA with you. Look down each such Match’s Shared Match list with you, using the #4 process above. The idea here is that not every cousin will share with every other cousin (remember only 50% of all your true 4C will share DNA with you; 50% will not!). So using this step usually adds a few more Matches to the group. [If you use a highlight color column, all of the Matches in a part of a Common Ancestor spreadsheet should get colored in.]

7. If you’re working on a Brick Wall (or NPE or bio-Ancestor, etc), go through the remaining Matches who have Trees and jot down the SURNAMEs in their Trees. Look for a Common Ancestor among those (usually more distant) Matches, who would be a good potential for an Ancestor at or beyond the Brick Wall.

In each case above, I add new Matches to my Common Ancestor Spreadsheet (now about 7,000 from Ancestry), and add them (and their path) to my Ancestry Tree (they are always living and private).
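For readers who keep their Shared Match lists in a file, the first three passes above can be roughed out in code. This is only a minimal sketch under my own assumptions: the `SharedMatch` record, the `triage` function, the sample names and the idea of matching on a Note prefix are all hypothetical, and the 90cM and 50cM cutoffs are just the rough thresholds from steps 4 and 5, not hard rules.

```python
from dataclasses import dataclass


@dataclass
class SharedMatch:
    name: str
    cm_to_base: float  # cM shared between this Match and the Match picked in step 1
    note: str = ""     # my Note for the Match, e.g. "#A0038P/4C1R: ALLEN/Elizabeth"


def triage(shared_matches, mrca_tag, close_cm=90.0, wider_cm=50.0):
    """Sort one Shared Match list into the review passes described above.

    Pass 1: Notes that already carry the MRCA tag.
    Pass 2: everyone else at or above close_cm with the base Match.
    Pass 3: everyone else at or above wider_cm.
    The rest are left for Brick Wall surname work (step 7).
    """
    buckets = {"same_mrca": [], "close": [], "wider": [], "remaining": []}
    for match in shared_matches:
        if mrca_tag in match.note:
            buckets["same_mrca"].append(match.name)
        elif match.cm_to_base >= close_cm:
            buckets["close"].append(match.name)
        elif match.cm_to_base >= wider_cm:
            buckets["wider"].append(match.name)
        else:
            buckets["remaining"].append(match.name)
    return buckets


# Made-up Shared Match list for illustration only.
sample = [
    SharedMatch("Ann", 1771.0, "#A0038P/4C1R: ALLEN/Elizabeth"),
    SharedMatch("Bob", 120.0),
    SharedMatch("Cy", 62.0, "MOESZINGER"),
    SharedMatch("Di", 24.0),
]
print(triage(sample, "#A0038P"))
```

Step 6 would then amount to running the same triage again with each confirmed Match as the new base, which is why the process is iterative.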

Sidebar: My Common Ancestor spreadsheet is a good tool for each family group based on an MRCA. I haven’t found a good way, yet, to analyze the Shared Matches who are related to me through the children or ancestors of the MRCA. However, I do note that they show up in the Shared Match list. For instance, I’m now working on my MRCA – 38P (this is the Ahnentafel representing Joseph ALLEN, along with his wife Elizabeth [39P on her own] – maiden name unknown). It’s comforting to note that many of the Shared Matches have Notes starting with #A0018P (an MRCA representing my Ancestor who married A19P, the daughter of 38P) and some close Matches with #A0008P, an even closer descendant of 38P. Normally, I would have some smaller cM Matches back to 76P and 78P (representing the two MRCA couples who are ancestors of 38P and 39P), but both 38P and 39P are brick walls… So Group Process #7 above is next on my list.

The above is a classic example of the iterative nature of genetic genealogy, and the importance of having a good Note system that lets you see the key elements in a Shared Match list. It all comes back to doing the homework of keeping good, visible, Notes at Ancestry. Tip: I now add a Match’s SURNAMEs to the Notes if I don’t have any other clues – I can then see these SURNAMEs in the Notes fields in a Shared Match list…

Bottom Line: I think the Pro Tools Shared cM feature needs an iterative process of reviewing Shared Matches to add in as many new Matches as possible under our MRCA groups. This also includes noting Shared Matches closer and more distant to each MRCA group; and analyzing remaining (usually smaller cM) Matches to break through more distant Brick Walls. Lots to do….

[22CP] Segment-ology: Pro Tools Part 8 Group Process; by Jim Bartlett 20240719

Pro Tools Part 7

What Is Your Plan of Action?

The Pro Tools feature that lets us see the amount of DNA (in cMs) between our Shared Matches is a significant tool. It allows us to “stitch together” families, to include Matches with skimpy, or even no, Trees. This could potentially impact all my 96,000+ Matches. That’s a lot of ground to cover…

So what’s the game plan – how do we most efficiently use this new cM data? What is the Plan of Action (POA)?

I see four different POAs – and I’m seeking your input on any insights you’ve found so far.

1. Work down our Match list. Start at the top, and methodically work on each Match that we haven’t placed in our Tree. The advantage here is that the top Matches (most shared cM) are usually the easiest to figure out. With Pro Tools we can see their top Matches, potentially ones with good Trees, and often tease out their place in our Tree. At the least, even if we cannot find the exact relationship, we can figure out which sub-branch of our Tree they are on (which is all we really need to know for them to be helpful forming a tight group).

2. Confirm each MRCA couple group. I’ve been working on this method for a while, using my Common Ancestor Spreadsheet. The focus is on all the Matches who have the same MRCA couple – does each one share an appropriate amount of cMs with the others. I must take care when some Matches have multiple relationships with me (colonial Virginia ancestry) and/or multiple segments – these could throw off a one-to-one analysis. But the main point here is: does each Match “fit”? I’ve found 2 so far (out of hundreds), who really don’t “fit” within all the shared cMs – indicating incorrect genealogy or an NPE. Bottom line: it is very comforting to see a large list of Matches under an MRCA that all “fit” each other (well within the Shared cM Project ranges). Each such MRCA couple at one generation, then is a strong foundation when working on the next generation – many Matches will be related to each other across two (or more) generations. More “comfort”…

3. Focus on specific problems.  Work on an unknown bio-Ancestor/NPE/Brick Wall. Build an appropriate group, and then re-review the Shared Match list for highest-cM Matches that may be helpful – and then look at their Shared Matches for more clues. This POA may foster a lot of “rabbit holes” and “blind alleys”. But the main point here is: build a group of interlocking Matches – they will often lead to insights.

4. Hit-n-Miss. Have fun chasing random leads. These sometimes result in a floating branch of your Tree. I have two of these – many Matches which apparently form a large (several hundred) list of Matches from one person – probably an Ancestor of mine, but no known paper trail link. Pro Tools will confirm, or not, if this is an “interlocking” group. If so, then I will look for Matches in that huge group, who have shared Matches with some other, known, MRCA group(s) of mine – hopefully there will be a strong consensus – there should be…

[Side bar: early in my Navy career (1971), I needed a system to track a lot of projects (before PCs) – we had a table of milestones. I called it: the Hectic Input and Tabulation of Numerous Milestones in our Sacred System (HITNMISS) – my boss was not impressed; so I changed the name to the Simplified Work Input and Follow Through (SWIFT) – I got promoted:>j]

I think the unifying theme above is to form interlocking Matches into a family group based on shared cM – it’s right up our alley as genealogists. And each such group is a very valuable foundation, with important links in different generations (up and down our Tree).

Are you using any of the above POAs? Have you developed a different POA that you’ve found to be particularly effective and efficient, or not? What’s the best way to incorporate all this new data? Please share.

[22CO] Segment-ology: Pro Tools Part 7 What Is Your Plan of Action; by Jim Bartlett 20240717

Pro Tools Part 6

Watch Out…

BLUF: Do not rely strongly on Ancestry’s suggested relationships – I find the true relationship is rarely the top one in Ancestry’s long list of possibilities; and it’s usually down their list somewhat. The cMs with my Matches are always within the ranges in the Shared cM Project and at DNA Painter. But, again, they are rarely at the average.

I’m reviewing all my Matches at the 3C level: 79 Matches with A16 MRCA couple; 92 with A18; 43 with A20, so far. None have been found to be outside the range of inter-relationships (perhaps 50% sampling). All are inside the appropriate ranges. BUT, two siblings may show vastly different values – one somewhat higher and one somewhat lower than the average. My engineer brain wants two siblings to have very close cMs, but the data is truly random (within the ranges of the Shared cM Project).

Bottom line: be careful, and don’t try to force a fit. Expect the values to be in the range for the relationship; but accept that they may be all over that range. And looking at it the other way, starting with a shared cM value, the relationship [of a Match without a Tree] will NOT necessarily (or even probably) be in a small set of Ancestry suggestions (although almost always on their long list – in the “Tree” view of a Match profile).
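The "in the range, but anywhere in the range" check is easy to automate once the pairwise cMs are written down. Here is a minimal sketch, assuming you supply your own range table: the function names and the Match names are hypothetical, and the numbers in `PLACEHOLDER_RANGES` are made-up round figures for illustration only, NOT Shared cM Project values. Substitute the real ranges from DNA Painter before using anything like this.

```python
# PLACEHOLDER numbers for illustration only; these are NOT Shared cM Project
# figures. Replace them with the published ranges for each relationship.
PLACEHOLDER_RANGES = {
    "3C": (0.0, 250.0),
    "3C1R": (0.0, 200.0),
}


def fits(cm, relationship, ranges):
    """True when the shared cM falls inside the range for the relationship."""
    low, high = ranges[relationship]
    return low <= cm <= high


def misfits(pairs, ranges):
    """Return the (match_a, match_b, relationship, cm) entries outside their range.

    Only the range matters here; distance from the average is deliberately ignored.
    """
    return [pair for pair in pairs if not fits(pair[3], pair[2], ranges)]


# Made-up pairwise values for illustration only.
pairs = [
    ("Match A", "Match B", "3C", 85.0),
    ("Match A", "Match C", "3C", 12.0),   # well below average, still in range
    ("Match B", "Match C", "3C1R", 900.0),  # outside the placeholder range
]
print(misfits(pairs, PLACEHOLDER_RANGES))
```

A pair that lands on the misfit list is a prompt to recheck the genealogy (or consider an NPE), not proof of an error.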

[22CN] Segment-ology: Pro Tools Part 6 Watch Out; by Jim Bartlett 20240711

Pro Tools Part 5

Small Segments At Work

I have Match A at 25cM with no Tree; but a nephew at 1773cM, Match B who has a Tree. Match B is a 3C3R on my Ancestor 16P.  Analysis of the 1950 census and his grandmother’s obit, gave me the same name as Match A and a place in my Tree. Match B is 9cM to me. Match B has another uncle at 1771cM, Match C. Match C is also listed by name in the grandmother’s obit. Match C is 8cM with me.  And, sure enough Match A and Match C share 2315cM [siblings] with each other [corrected 7/9/2024].

This is about as solid as it gets. Clearly the 9cM Match B and the 8cM Match C are true cousins to me, per genealogy. Each of these Matches share one DNA segment with me. Although this data doesn’t “prove” these 3 segments are the same and linked back to our MRCA 16P, I’d be willing to wager that an upload to GEDmatch would show these segments would Triangulate; and match many other segments from MRCA 16P.  In a genealogy sense it doesn’t matter: these Matches belong in my Tree – with or without a DNA link.

I find this example compelling. The old saw: when you hear hoofbeats, think horses. Yes, zebras are a possibility, but the odds in the USA are way in favor of horses.  These individuals show up as DNA Matches to me – they share a segment of DNA with me. Some segments are small, some are large. When they come from such a tight fit in one part of my Tree, I’m inclined to believe that they are the same segment. It is “possible” that they each got a randomly different segment, or even false segments, but the logical reasoning is that they share part of the same segment from an MRCA. Why not just accept that for now? Perhaps, someday, some alternative will come up – even so, it would not change the genealogy backed up by records.

Icing on the cake – in reviewing Match C’s shared Matches, Match D (8cM to me) is 3476cM (a daughter) of Match C – another add to my Common Ancestor spreadsheet and to my Tree.

Bottom Line: ProTools is providing a lot of great bread crumbs to follow; and linking a lot of small cM Matches to my Tree. Be sure to scroll to the bottom of a ProTools Shared Match list, looking for high cM interrelationships! Don’t discard genealogy “finds”, just because they share small cMs.

[22CM] Segment-ology: ProTools Part 5 Small Segments At Work; by Jim Bartlett 20240707

Pro Tools Part 4

The Spreadsheet

By popular request, below is a section of my Common Ancestor Spreadsheet. Shown are most of the essential columns. In order to fit the space I have in this blog, I’ve deleted a number of columns that I use to record emails, TGs, Notes, Y or mtDNA possibilities, etc. – they are not pertinent to the point of this post. On the far right are 3 columns for cMs between a Match and the *Match for that column.

Common Ancestor Spreadsheet with columns for Shared cM between Matches

Note this part of the spreadsheet is for DNA cousins on my Ancestor John H BARTLETT b 1804 (married to Sarah FLEMING). For each Match, I have their Name, any Admin, cM (with me), # segs, Ahnentafel of MRCA couple (all are 16 in this section), Cousinship; and then the given name and birth year of the child of the MRCA through which they descend; same for grandchild; and Great grandchild; and then a column for more descendants if desired (all in one cell – and I usually run this out – down to the Match). The ** in green means that Match (and the path) is in my Tree. The next columns are for entering an * for a Key *Match and the amount of shared cM between the other Matches and that *Match. You’ll note near the bottom of the spreadsheet, child James b 1836 is listed – he is the child that I descend from (and cousins on him would be under Ahnentafel 8). Hundreds of other MRCA couples, and thousands of Match cousins, are in other sections – all sorted by Ahnentafel # and birth year columns.

To do a perfect matrix, I’d need to have 67 columns to show all of the pair-wise relationships. I think I can get a pretty good picture from only one Match for each grandchild. And, of course, as I find Shared cMs over about 100, I usually go down each of those rabbit holes and wind up adding most of those Matches to my spreadsheet.

Please feel free to use as much of this format as you like, AND to add/delete/shift columns to suit your own style of research and analysis.

[22CL] Segment-ology: ProTools Part 4 The Spreadsheet; by Jim Bartlett 20240705a

Pro Tools Part 3

BLUF – The matrix which can be created by all the shared cM relationships is also showing the range of cousins who don’t Match each other.

I have now shifted to using my Common Ancestor Spreadsheet to analyze the cMs between my Matches. This spreadsheet lists about 9,000 Matches who are known cousins on specific Ancestors (a small percentage of Matches share multiple Common Ancestors with me). The backbone of this spreadsheet is a list of all my Ancestor couples out to 8C level (and some beyond), with columns for their Ahnentafel number (e.g. 16) and the husband’s birth year. Under that goes a row for each Match with that Ancestor as a Common Ancestor, with the Ahnentafel number, the cousinship (e.g. 3C1R), and the given name and birth year of the child of the CA couple through whom the Match descends. The next two columns to the right are the Match’s Ancestor who is the grandchild of the CA, and their birth year; etc. With this setup, I can sort on Ahnentafel Number and the first birth year column and then the second birth year column and the whole spreadsheet sorts into family groups.
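The three-key sort described above is the same thing a spreadsheet program does, and it can be sketched in a few lines for anyone who exports their sheet. This is only an illustration under my assumptions: the column names (`ahnentafel`, `child_birth`, `grandchild_birth`), the function name and the sample rows are hypothetical, not the actual spreadsheet headers or data.

```python
def sort_into_family_groups(rows):
    """Sort Match rows by Ahnentafel number, then the child's birth year,
    then the grandchild's birth year, so each family group ends up together."""
    return sorted(
        rows,
        key=lambda row: (row["ahnentafel"], row["child_birth"], row["grandchild_birth"]),
    )


# Made-up rows for illustration only.
rows = [
    {"match": "Match 3", "ahnentafel": 18, "cousinship": "3C", "child_birth": 1840, "grandchild_birth": 1868},
    {"match": "Match 1", "ahnentafel": 16, "cousinship": "3C1R", "child_birth": 1836, "grandchild_birth": 1862},
    {"match": "Match 2", "ahnentafel": 16, "cousinship": "3C", "child_birth": 1830, "grandchild_birth": 1855},
    {"match": "Match 4", "ahnentafel": 16, "cousinship": "3C1R", "child_birth": 1836, "grandchild_birth": 1858},
]

for row in sort_into_family_groups(rows):
    print(row["ahnentafel"], row["child_birth"], row["grandchild_birth"], row["match"])
```

Descendants of the same child, and then of the same grandchild, land in adjacent rows, which is what makes the close relationships visible when a * column of shared cMs is added alongside.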

I am now selecting a Match and entering a * in a new column; and then, in that column, the cM of their closest Matches already in the spreadsheet. [NB: As previously reported, I’m also finding Matches who are very close relatives to the *Match (sometimes a parent or child or 1C), which causes me to go down that rabbit hole – which, in turn, frequently results in a new known cousin Match added to the spreadsheet – it’s like drinking through a fire hose.]

Anyway, as I now look down the amount of Shared cM between Matches (in a * column), I can clearly see the parents/children, siblings, aunts/uncles/nieces/nephews and close 1C and 2C in close rows of the spreadsheet. The Shared cMs get smaller and smaller up and down the spreadsheet – in fairly predictable order as the spreadsheet has different “layers” of relationships – it’s very comforting to see this pattern. Mind you, it’s not a straightforward “curve” – there is the same “jumble” that is reflected in the Shared cM Project cMs – the overlap of possible ranges among different cousinships.

The other thing that is showing up under a *Match, is that not all the 3C or 4C or 5C are showing up as Matches. This is expected. Remember the rough estimates that true 3C only match 90% of the time; and 4C only match about 50% of the time; etc. I would need to have 9,000 columns to perform a full analysis, and that probably isn’t in the cards. Perhaps one of the 3rd party programmers can come up with an automated program to do this…
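Until someone builds that program, one * column at a time is manageable. Below is a minimal sketch of filling in a single * column and listing the cousins in the group who do not show up as Shared Matches of the *Match. Everything named here is my assumption for illustration: the `star_column` and `non_matching` helpers, the idea of keying recorded cMs on an unordered pair of names, and the sample values.

```python
def star_column(key_match, group, shared_cm):
    """Build one * column: the cM each group member shares with the key Match.

    shared_cm maps frozenset({name_a, name_b}) to the recorded cM for that pair.
    A member with no recorded value gets None (a true cousin who does not match).
    """
    return {
        name: shared_cm.get(frozenset((key_match, name)))
        for name in group
        if name != key_match
    }


def non_matching(column):
    """Names in the column with no shared cM recorded."""
    return [name for name, cm in column.items() if cm is None]


# Made-up group and pairwise values for illustration only.
group = ["Key", "Cousin 1", "Cousin 2", "Cousin 3"]
shared_cm = {
    frozenset(("Key", "Cousin 1")): 212.0,
    frozenset(("Key", "Cousin 3")): 41.0,
}

column = star_column("Key", group, shared_cm)
print(column)
print(non_matching(column))
```

Storing only the pairs that were actually observed keeps the matrix sparse, so the blanks are what reveal the cousins who do not match each other.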

Bottom line: for now, it appears the concept of “true cousins don’t always match each other” is alive and well in the Shared cM data…

[22CL] Segment-ology: ProTools Part 3 TIDBIT; by Jim Bartlett 20240705

Pro Tools Part 2

A ProTools Epiphany…

As I walk down my hitherto unknown Matches, I’m setting up a small spreadsheet for each group.

Important: Starting with a “base” Match, scroll through *all* the Shared Matches – looking for those who share lots of shared cM with each other. Generally 90cM is a good threshold – these Shared Matches (with each other) would generally be 1C or 2C to each other. If I drop down to about 50cM, I get 3C & 4C too. Feedback from the LEEDS method indicates these over-90cM Matches tend to share the same grandparent. This can occur between two Matches who share much less cM with you. Your Matches may be fairly distant; but among themselves they are closely related. Often, some of these Matches are known to you – either through a good Tree or a ThruLines clue (with reference material). Looking through *all* of the Shared Matches, and then through *their* Shared Matches, I’ve usually found a group of Matches who are closely related to each other on some branch of my Tree.
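One way to picture this grouping is to link any two Matches who share at least the threshold with each other and then collect everyone who is chained together. The sketch below does that with a simple connected-components walk. This is my own illustration, not something ProTools or the LEEDS method provides: the `close_groups` function, the pair-keyed input and the sample values are all hypothetical, and 90cM is just the rough threshold mentioned above.

```python
from collections import defaultdict


def close_groups(pair_cm, threshold=90.0):
    """Group Matches who are linked to each other by pairs at or above threshold.

    pair_cm maps (name_a, name_b) to the cM those two Matches share with each other.
    Matches with no qualifying pair are left out of the result.
    """
    linked = defaultdict(set)
    for (a, b), cm in pair_cm.items():
        if cm >= threshold:
            linked[a].add(b)
            linked[b].add(a)

    seen, groups = set(), []
    for start in sorted(linked):
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:  # walk everything reachable from this Match
            name = stack.pop()
            if name in group:
                continue
            group.add(name)
            stack.extend(linked[name] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Made-up pairwise values for illustration only.
pair_cm = {
    ("A", "B"): 1773.0,
    ("B", "C"): 1771.0,
    ("A", "C"): 2315.0,
    ("D", "E"): 95.0,
    ("C", "F"): 30.0,  # below threshold, so F is not pulled into the group
}
print(close_groups(pair_cm))
```

Dropping the threshold to about 50 widens the groups to include more distant cousins, at the cost of occasionally chaining together groups that belong on different sub-branches.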

Epiphany: At this point it is not critical, or even necessary, to “pin the tail on the donkey” precisely. These Matches may well be 3C or 4C or 5C or more to you, but they collectively anchor a sub-branch of your Tree. The fact that they share high-cMs with each other, is a very strong indication that their bond is strong and correct [classic genealogy triangulation]. And, even though they are more distantly related to you, their *grouping* is a strong indication that they are related to you through your Common Ancestor to that sub-branch.

Each of the Matches in this sub-branch (including those without Trees of their own), becomes a strong “tell-tale” that tracks a Shared Match Cluster and/or a Triangulated Segment of your DNA.

There is so much new ground to cover here, that I’m now shifting my focus. Instead of trying to fit each Match into a specific place in my Tree, with detailed genealogy research, I’m just highlighting the groups who are clearly descended from a specific person in my Tree. This specific person may be a child, or grandchild, or great grandchild of my Ancestor. At this point, it doesn’t add any more value to my Tree as a whole to know exactly how they relate to each other – just that they do closely relate to each other. Five or ten or twenty of my unknown Matches are now under a grandson of one of my specific Ancestors – although, they may all be around 5C to me. Other Matches cannot be in that sub-branch, unless they share an appropriate amount of DNA with others in that group. So, using inverse logic, we must find a different sub-branch for these other Matches.

This process remains a hoot, and a game changer at Ancestry. I really think a large percentage of our Matches can now be correctly put into sub-branches of our Trees. This also highlights Match groups which will be helpful in getting through Brick Walls. Every IBD Match has to tie into some part of our Tree – ProTools is looking like a great tool to help place those Matches, and perhaps identify some small-cM false Matches. For me this clearly helps identify small cM Matches who are related within a genealogy timeframe (as opposed to being very distantly related).

[22CK] Segment-ology: ProTools Part 2 TIDBIT; by Jim Bartlett 20240702

My Take on Ancestry Pro Tools

A Segment-ology TIDBIT

Ancestry ProTools includes several features. ProTools costs $10/month extra on an existing Ancestry account. This post is focused on the cMs between Shared Matches. I’ve fiddled with it for a few days, and, of course, have come up with a helpful spreadsheet.

One method: Focus on a “base” Match of interest to you.

Start with a Match of interest to you (often a high-cM Match with an unknown, or iffy, link to your Tree). I call this the “base” Match. Click on Shared Matches (to use ProTools or subscribe to it).

The resulting Shared Match list (with ProTools) took me a while to get used to. It is essentially a list of the Matches that you and your selected Match have in common. This is a fundamental building block of Shared Match Clustering, and Matches who appear on each other’s Shared Match lists tend to all have the same Common Ancestor. These Clusters can include Matches with Trees (where you can search for a Common Ancestor among them), as well as Matches with Unlinked Trees, Private Trees and NO Trees.

However, the ProTools Shared Match list also reveals the cMs shared between your “base” Match and each of the Shared Matches on the list. These cMs may, or may not, be significant information. So far, the list is only arranged by cMs shared between you and the Matches. I’ve found my “go to” process is to scroll down the right hand list and check the cMs shared between your “base” Match and each of the Shared Matches. This is often a wide range of cM values – from 20cM up to some real surprises. These surprises may be on the last page of Shared Matches – so, for me, it’s well worth the time to look at all the pages of Shared Matches (20 Matches per page). There is a rumor that Ancestry is working on a way to let us sort on this value. I have found several cases of a relatively small Match to me, who is a parent, or child, or sibling, or other very close relationship to the “base” Match. This is often a “BINGO” for me – particularly when one or the other of this duo doesn’t have a Tree. These close relationships can also be game changers – 1C, 2C or even 3C can show a family group in one “sub-branch” of your Tree – importantly, separated from other branches.

Inverse Logic: If you are pretty sure of some Matches who descend from one child of a particular Ancestor, and a group of Matches (among themselves), appear to be on the same line, but their cMs with you are somewhat smaller than the other Matches from that Ancestor, then this is a strong clue they are related another generation back, or so.

In any case, this info can be very valuable in conjunction with a WATO analysis at DNAPainter.

Another method: Work on your top Matches on one branch of your Tree.

Of course I tried several spreadsheet methods. The one that works best for me is a list of my top Matches on one branch of my Tree. I determine these Matches from my Notes (derived from ThruLines; Clusters; UnListed Trees; blind luck; etc). Almost all are captured in my Common Ancestor Spreadsheet (here). Since I know how most of my Matches relate at the grandparent level, I focused on the Great Grandparent groups and/or 2xG Grandparents who were on my paternal side. In other words, on known, or suspected, Ahnentafels: 8P, or 16P and 18P, or occasionally one more generation back (32P-39P).

I walked down my Paternal List of Matches and selected the ones I had Notes for that indicated they were from my targeted branch, or, based on previous Clustering, who were Likely to be on that targeted branch (Likely Matches were labeled with an “L”, and usually had NO or very small Trees). I listed the Match Name, cM, Relationship (e.g. 8P/2C1R), and sometimes the Child the Match descended from. Feel free to add any columns that might be helpful to your analysis – columns can always be moved or deleted or hidden. Out of this list I selected a Key Match (often unknown) and put an asterisk (*) adjacent to them in a new column. I then clicked on the Key Match’s Shared Matches and reviewed that list – on the right side was the shared cM with each Match. Initially I went from top to bottom of that list and put the shared cM amount in the column under the * and in the row for the Match – creating a matrix of sorts. After a few iterations, I limited this to shared cM amounts over about 50cM and highlighted amounts over 90cM. As indicated above, I sometimes found very large cMs, indicating very close relationships – clearly on one particular branch twig in my Tree; and sometimes one Match had a full Tree and the others did not (very useful, bringing Matches with little to no info into play). One vexing Match has a father born the same year as me, so I can assume a 1R relationship (and her 162cM is 2C1R 53% of the time per DNA Painter). AND I note she shares 1883cM with another Match who is highly suspected of having an NPE bio-parent in my Tree – the clues are adding up.

The method above is also creating a sub-branch, that could very well be from an unknown wife/mother (39P) for whom I have very few Matches so far. In these cases, I’m creating additional * columns for the highest cM Match in that group and looking at their Shared Matches – looking for one of their closer Matches who might have a Tree; or looking for other Shared Matches who might provide Trees or other insights – all in all: looking for a Cluster that might go back to 39P…

As I’m playing with this method, and adding more * columns (creating a matrix), I’m basically identifying all my Matches on my 8P/9P MRCA branch, and subdividing them into sub-branches. This will get me to a good Cluster from Matches back through 8P/9P MRCA to 18P/19P to 38P/39P and ultimately to the 78P/79P MRCA who are parents of my unknown wife/mother: 39P.

Traditional Clustering methods can do this alone, but knowing the cM relationship between the Matches helps a lot.  

Clearly I’ll be spending time with this new spreadsheet. I can add new Matches that are close to my key Matches but may be under 50cM, or even at 20cM, with me, but with helpful Trees and/or Unlinked Trees. At any rate, it’s easy to sort the spreadsheet on an * column, and easily see Matches who should be grouped on a sub-Branch. And, at any time, I can easily use DNAPainter’s WATO tool to focus on likely Branches. It’s a whole lot easier to find a link by building a Match’s small tree back, when I have good intel on the Surnames and geography and timeframes.

ProTools identification of shared cMs between Matches is a strong addition – well worth $10 for a trial month, IMO.

Please feel free to post your own methods of squeezing out more info using this feature of ProTools.

[22CJ] Segment-ology: My Take on Ancestry ProTools TIDBIT; by Jim BARTLETT 20240629

Shared Segments for Small Segments

The Shared cM Project is an important and powerful tool for genetic genealogy – particularly with its integration with the DNA Painter tools. Over 60,000 submissions is impressive.

Two observations on the Shared cM Project – a very high percentage of the submissions were for the closer relationships; and the data was from many different users (perhaps with varying degrees of accuracy).

I now have over 9,000 entries in my Common Ancestors spreadsheet [see my  blogpost about this spreadsheet tool]. I’ve curated these down to 7,800 entries from 1C to 8C, that I am pretty confident are correct. Also, my analysis is that there is a high probability, based on Trees and Shared Match Clusters, that each shared segment is from the Common Ancestor.

So I decided to compile cM statistics from my own curated data. I also wanted to see how the small cM relationships played out.

My data is not nearly as robust as the Shared cM Project was for 4C and closer relationships. However, in the 5C range my data was closer; and in the 6C to 8C range I generally had more data points than the Shared cM Project. This reflects my emphasis on all Ancestors out to 8C range.

Overall, there were no big surprises. In general, for 6C to 8C my data was in a tighter range; and I had some data for distant relationships that weren’t in the Shared cM Project (but, no surprises).

Bottom Line: In my opinion the ranges in the Shared cM Project are a little broad – probably a reflection of data from so many sources. I think the broader ranges give folks more wiggle room with low percentage probabilities, when they should really be looking for other possibilities.

Here is my table comparing my data with the Shared cM Project data – the top row indicates the full cousinship; once removed (1R) and twice removed (2R):

The significant increase in data points at the 6C level reflects the power of ThruLines to build Trees back (subject to my review); but only out to 6C.

As always, feedback is welcomed.

[06F] Segment-ology: Shared cMs for Small Segments; by Jim Bartlett 20240605

Which Sibling Is the Bio-Ancestor?

A Segment-ology TIDBIT

Up Front – it’s the one with the highest average cM among Match cousins.

Setup: You’ve pretty much determined a particular couple are bio-Ancestors to yourself (or someone else) – often by a consensus of Match Trees in a group (usually a Cluster) – see here. However, this bio-couple had a number of children. Which one of them was the bio-Ancestor? It gets harder and harder the more generations back you are researching.

Process: I’ve had good outcomes by determining as many DNA Match cousins as possible for the bio-Ancestor couple. Line up the DNA Matches and the shared DNA cMs under each of the children, and then determine the average cM for each child. In general, one of the averages will be somewhat more than the others – even when you don’t know the link. That’s because you are a closer cousin with Matches who descend from the same child as you do. For instance, you may be a 5C through most of the children – sharing an average of 25cM with those Matches; and you would be 4C with the Matches who descend from the one child who is your Ancestor – sharing an average of 35cM with them. Of course, our results may vary somewhat from the Shared cM Project, but it’s the concept we are focused on here.

When I do this analysis, I drop down into the smaller segments, in order to get a fair comparison among all the cousins I can find. The more Matches we use, the more it averages out to the Shared cM Project and the correct bio-Ancestor child.
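The averaging itself is simple enough to sketch. The example below is a minimal illustration, assuming each Match has already been assigned to the child of the couple it descends from; the function names, the child names and the cM values are all made up, chosen only to mirror the 25cM versus 35cM example above.

```python
from collections import defaultdict
from statistics import mean


def average_cm_by_child(matches):
    """matches is a list of (child_name, shared_cm) pairs, one per DNA Match.
    Returns the average shared cM for the Matches under each child."""
    by_child = defaultdict(list)
    for child, cm in matches:
        by_child[child].append(cm)
    return {child: mean(cms) for child, cms in by_child.items()}


def likely_ancestor_child(matches):
    """The child whose descendants share the highest average cM with you."""
    averages = average_cm_by_child(matches)
    return max(averages, key=averages.get)


# Made-up values for illustration only.
matches = [
    ("John", 25.0), ("John", 20.0), ("John", 30.0),
    ("Mary", 35.0), ("Mary", 40.0), ("Mary", 30.0),
    ("William", 22.0), ("William", 28.0),
]
print(average_cm_by_child(matches))
print(likely_ancestor_child(matches))
```

With only a handful of Matches per child the averages can easily mislead, which is why including the smaller segments to get more data points matters.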

[22CI] Segment-ology: Which Sibling Is the Bio-Ancestor? TIDBIT by Jim Bartlett 20240403

Celebrating the First 25 years of Genetic Genealogy

Featured

Free eBook: Genetic Genealogy: The First 25 Years – 82 pages – the reflections of 34 Contributors – compiled and edited by Diahan Southard. This is a fascinating read from cover to cover. And it’s free to download here: https://diy.yourdnaguide.com/so-far

I am honored and humbled to be included in this project. And a grateful hat-tip to Diahan who conceived this project; herded the cats to gather the various perspectives; curated and edited the inputs and got it ready before RootsTech 2024. And made it free to everyone!

Thanks, Diahan Southard.

[99C] Segment-ology: Celebrating the First 25 years of Genetic Genealogy by Jim Bartlett 20240229

ThruLines Is Quick – Really Quick!!

Featured

A Segment-ology TIDBIT

My previous post noted that ThruLines quickly adapted when I changed my Tree.

Setup: I have looked at every one of my ThruLines Matches. If you are not sure, just open your DNA Matches list and select the Filters: Unviewed AND Common Ancestors. If you’ve looked at them all (and hopefully added appropriate information in the Notes box for each one), after a minute or two you’ll get a message: No matches match the selected filter. You’re now ready to take advantage of this status.  

I have a pesky female Ancestor. I’m not really positive where she fits in a larger part of my Tree (or to any of several floating branches).  So I called up her profile; clicked on Edit (top right); clicked on Edit relationships; and clicked on the parent “X”s (to separate, not delete, them). I now went to the Father box and clicked on Add father; and typed in a name I wanted to test as a parent. I then closed the Edit relationships page and went back to my DNA Matches List and filtered on Unviewed AND Common Ancestors…. and ThruLines immediately populated appropriate new Matches who would be cousins through that parent. In the one to two minutes it takes ThruLines to search my 93,000 Matches, it found and listed Matches with ThruLines. Since I had already opened all previously known ThruLines, this new listing was only Matches who were related through the change I had just made. I quickly took notes and reset the original pesky Ancestor. Ready for the next trial. In and out very quickly.

There is more to this story for a later blogpost. The point for this blogpost is twofold:

1. AncestryDNA must already have most of these relationships worked out, just waiting for me to ask the right question (do you have cousins for “this” relationship?)

2. There is no waiting days for a “refresh” – ThruLines reports as fast as it can scan my Match list (down to 6cM). Just WOW!

Both of these are pretty amazing, IMO.

[22CH] Segment-ology: ThruLines Is Quick – Really Quick!! TIDBIT by Jim Bartlett 20240228

ThruLines is Quick!

Featured

A Segment-ology TIDBIT

I was entering a ThruLines line of descent into my Common Ancestor Spreadsheet, when I noted an error in the Match’s Tree. The Tree and ThruLines were at 6C. When I inserted the missing generation in my Tree, the relationship changed to 6C1R. As soon as I clicked back to the Match, the ThruLines was gone!  AncestryDNA now *knows* the correct relationship, and since it was beyond 7 generations for one of us, they won’t show it.

Heads up. Copy or screen-shot before you lose the ThruLines link. I guess in a pinch, I could go back to my tree, take out the generation I added, and “reincarnate” the ThruLines link. Sometimes you have to think like a computer…

[22CG] Segment-ology: ThruLines is Quick! TIDBIT by Jim Bartlett 20240225

AncestryDNA Side vs ThruLines Side

Featured

As I look at ThruLines Matches under 15cM, roughly half of them have a Side (Maternal or Paternal) which is different from the Side of the Common Ancestor proposed. What’s up?

AncestryDNA has determined a “side” (Maternal or Paternal) for most of my Matches. Pretty slick! And very helpful!! For above-20cM Matches they appear to be fairly accurate. This is despite the fact that all of my Paternal and half my Maternal Ancestors were mostly from Colonial Virginia. I was expecting a lot of Matches to be “Both”, but relatively few are. The bulk of my Matches are in the Maternal and Paternal categories. And below 15cM, the Maternal or Paternal “sides” are not aligning with the “side” for many of the ThruLines Common Ancestors. Side note: it appears that Ancestry is now only reporting one ThruLines Common Ancestor per Match – they used to report two or three if they found them….

What are the possibilities?

1. The AncestryDNA “sides” may be incorrect. I’d like to think (hope?) that the science behind them is valid and that they are largely correct. Most of mine above 20cM appear to be.

2. The ThruLines may be incorrect. This is a genealogy area (not DNA). With my 50 years of genealogy research, I already know many of the descendants of my Ancestors, and I run a check (not-GPS-comprehensive) on each ThruLines reported. I used to spot about 5% with errors (some of which were easily fixed), but now there are more and more as AncestryDNA appears to have become fairly aggressive at finding Common Ancestors. It appears they have loosened up the algorithms to allow “close” name variants and “close” dates, resulting in more false results. But even with the ThruLines I review and accept the Common Ancestor from a genealogy point of view, there are roughly half which don’t agree with the “side”.

We cannot have it both ways… or can we?

When AncestryDNA determines a Maternal “side”, does that guarantee that 100% of the Match’s atDNA can only be on my Maternal side? I really think that is absurd! Particularly when you consider most of my Ancestry is from Colonial Virginia. Surely my Colonial Virginia Matches could descend from Ancestors who would be on both sides of my Ancestry. In fact, I have several of my own Ancestors who, due to distant pedigree collapse, are on both sides of my Tree.

I think it is entirely possible that the bulk of a Match’s atDNA could align with my Paternal or Maternal DNA, but that some of the Match’s segments could be from the other side. I’m scratching my head over whether or not this could occur half of the time.

3. Both Ways! My conclusion is that we can have it both ways! I have a colored Dot for cases with both “sides”, but I’ve decided not to let that, by itself, stand in the way of accepting a ThruLines Common Ancestor as valid.
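For anyone who wants to check their own numbers, here is a minimal sketch of how the agreement rate could be tallied by cM band. The rows are invented for illustration, and the tuple layout (shared cM, AncestryDNA side, side of the accepted ThruLines Common Ancestor) is an assumption about how you record your notes; Matches marked “Both” or unassigned would need their own handling.

```python
# Hypothetical rows: (shared cM, side assigned by AncestryDNA, side of the accepted ThruLines CA).
rows = [
    (42, "Paternal", "Paternal"),
    (27, "Maternal", "Maternal"),
    (23, "Paternal", "Paternal"),
    (14, "Maternal", "Paternal"),
    (12, "Paternal", "Paternal"),
    (9, "Paternal", "Maternal"),
    (8, "Maternal", "Maternal"),
]

def agreement_rate(rows, low, high):
    """Fraction of Matches with low <= cM < high whose two sides agree (None if the band is empty)."""
    band = [(side, ca_side) for cm, side, ca_side in rows if low <= cm < high]
    if not band:
        return None
    agree = sum(1 for side, ca_side in band if side == ca_side)
    return agree / len(band)

below_15 = agreement_rate(rows, 0, 15)
above_20 = agreement_rate(rows, 20, float("inf"))

print(f"Under 15cM: {below_15:.0%} agree")
print(f"20cM and up: {above_20:.0%} agree")
```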

I’m curious about your overall experience and observations about conflicting “sides”. You are encouraged to add your insights in the comments.

[35AA] Segment-ology: AncestryDNA Side vs ThruLines Side by Jim Bartlett 20240213

ThruLines Levity

Featured

A Segment-ology TIDBIT

Ancestry’s ThruLines is like “dumpster diving”… sometimes you have to dig through the trash to find the pearls. Sometimes there is a smorgasbord of various genealogy junk, but sometimes there is a treasure trove of good information. Pick and choose wisely…

[22CF] Segment-ology: ThruLines Levity TIDBIT by Jim Bartlett 20240211

Let the Chips Fall Where They May

Featured

A Segment-ology TIDBIT

Thinking about Small Segments and Distant Matches…

Many have used the Speed and Balding IBD Statistics in Figure 2 of their Paper …  This chart has often been used to scare us away from small segments [by small I mean 7-to-15cM Shared DNA Segments – I do not encourage anyone to use smaller/”tiny” segments].

The vast majority of our Matches at AncestryDNA fall into this 7-to-15cM category, and I get many ThruLines Matches which have valid paper genealogies. They may not all link to the DNA, but I see no reason to discount them based on the small size of the Shared DNA alone. ThruLines is limited to Matches who are related as 6C or closer – not what I would call a “distant” Match. Only the small Shared Segments and the constant reference to the Speed and Balding chart, warning that small segments are usually distant, stand in the way.

This got me to thinking (watch out!)… The AncestryDNA Timber algorithm is well known to “down weight” the cM of many of our Shared DNA segments. Click on the “DNA” line in any Match Profile to see the “Unweighted shared DNA” amount – often somewhat larger than the amount shown on the DNA Profile. This is Timber at work, downweighting the DNA that would be shown at, say, GEDmatch. One of the effects of this downweighting is that many of the AncestryDNA customers who would show up as a Match at GEDmatch are never shown as a Match to us at AncestryDNA! It seems to me that AncestryDNA has already compensated for the statistics reported by Speed and Balding. It is thus unfair to compare our Match lists with the Speed and Balding statistics.

I’m not saying that some of our Matches are not distant – some of them are. What I am saying is to let the chips fall where they may. If we can find a Common Ancestor – at *any* Shared cM amount – why not accept it (if it also passes a genealogy review)? The Shared cM Project clearly shows small Shared DNA Segments in the range for cousinships at 3C and more distant. Why should we be frightened away when our Match falls into the small segment category?

My Common Ancestor Spreadsheet (see my blog post here) now has over 8,000 rows of Matches with Common Ancestors with me. I sort them to get “nested” family groups, and draw comfort as I see the closer families and note they are Shared Matches with each other. New ThruLines have been pouring in recently (and the quality is dropping off a little). As expected in my Common Ancestor spreadsheet, a majority are in the small segment range. I am not worried about the cM size as long as the genealogy is valid!

Bottom line: Let the chips (small Shared cMs) fall where they may; and focus on the genealogy.

Happy New Year!

[22CE] Segment-ology: Let the Chips Fall Where They May TIDBIT by Jim Bartlett 20240101

Gold Stars

Featured

A Segment-ology TIDBIT

There are several key elements of good genetic genealogy – I’m going to call them Gold Stars.

1. DNA Match – as designated by the testing companies and GEDmatch. Most of these are our genetic cousins. I have a lot of them (over 120,000); and they are a good subset to work with. Worth a Star.

2. IBD Segment – We generally assume that virtually all Matches above 15cM have true genetic links; and my analysis is that about 66% of those 8 to 15cM are also true. Granted, some of the under-20cM Matches will be beyond a genealogy time frame (about 9 generations for me), but IBD still gets a Star.

3. Common Ancestor – This is a primary goal of genetic genealogy – finding a Common Ancestor with each Match. Notes: some Matches will have multiple CAs within a genealogy timeframe; just finding a CA does NOT necessarily mean that the Shared DNA segment came from that CA; a Match may share multiple DNA segments, and possibly multiple CAs. So finding a CA is worth a Star.

4. ThruLines (and Theory of Family Relativity) – I’ve found these to be over 90% correct. If you agree with them – add a Gold Star.

5. Same side – Ancestry and FamilyTreeDNA now indicate the “side” that each of our Matches is probably on. So far, I think this process is pretty accurate. The Common Ancestor should agree with the “side” for a Gold Star. If there is not agreement with the side, there may be an additional Common Ancestor with the Match (on the same “side”); or the “side” may be incorrect.

6. Paper Trail – each Common Ancestor should be supported by good genealogy paper trail of solid records. Not always possible; but add a Gold Star if you can document your and your Match’s paper trails.

7. Segment Triangulation – indicates your DNA segment is an IBD (true) shared segment; and probably the Matches’ segments are too. A Gold Star.

8. Shared Matches – [aka In Common With; Relatives in Common]. If most of the Shared Matches are in agreement, add a Gold Star.

9. Clustering – tends to group DNA Matches on an Ancestor. If the consensus of Matches in a Cluster is an Ancestor (or even 2 or 3 in an Ancestral line), add a Gold Star.

10. Reasonable Tree – does the Match with a Common Ancestor have a reasonable Tree? If a Match has a Tree with just one descendant (the Match’s Ancestor), that is a warning signal [NO Gold Star]. If a Match has a Tree with way too many children, given names repeated, different children with the same birthdate – this is probably a research Tree with a collection of every possible child – sometimes born at many different locations – warning-warning! This is very flimsy evidence [NO Gold Star]. However, if the Match’s Ancestral line shows a reasonable number of children, spaced 1 to 3 years apart, that is a good sign. Alignment with census records is a plus. Use judgment to claim a Gold Star.

Ideally, we’d have 10 Stars for each Match – but, that ain’t gonna happen very often… And I probably won’t be adding a Star # in my Notes. But I do review most of these when I accept a Match with a Common Ancestor. I just thought I’d share my compilation of thoughts when I find a CA.

This may be an imperfect list, but I hope it is helpful. Improvements/suggestions are welcomed in the comments. This Gold Star concept is not a set of hard rules – it’s intended to be helpful ideas. Your judgment should be the final say for your genealogy.

Note for genealogists – our genetic cousins are a small fraction of all our true cousins. I often add individuals to my Tree who are not DNA Matches.

[22CD] Segment-ology: Gold Stars TIDBIT by Jim Bartlett 20231229

Quandary

Featured

A Segment-ology TIDBIT

What if the genealogy is correct but the shared DNA is on the other side? Discard because the relationship is not from the Ancestor who passed down the DNA segment? Save because we are in fact real cousins, despite the DNA? Most of our real cousins beyond 3C won’t share enough DNA to be designated as a Match.

Same quandary with a Match sharing one DNA segment, but related two ways. Both ways cannot be through the same segment.

Now that Ancestry shows “sides” (Maternal/Paternal), I’m finding that some of the ThruLines are not on the same “side”.

Sometimes this happenstance leads to finding a genealogy error and/or finding another genealogy relationship which is compatible with the shared DNA segment – sometimes not.

With almost 50 years of genealogy research under my belt, I’m very reluctant to “discard” any true relationship. I worked for 35 years finding cousins before atDNA testing came along – I’m not going to trash tens of thousands of cousins just because they don’t share DNA with me. They certainly share Ancestry with me – and records and stories and friendships.

On the other hand, my current quest is a deep Chromosome Map – linking my DNA segments to my Ancestors. Sort of a “who is responsible” for each of my quirks. A relationship that is not based on a DNA segment, is a distraction at best… a wrong rabbit hole… a misdirection… an error!

I think the solution is to keep all the findings, but clearly mark the genetic genealogy ones.  What is your take? Please leave comments.

[22CC] Segment-ology: Quandary TIDBIT by Jim Bartlett 20231224

Go for the Triple Play!

Featured

A Segment-ology TIDBIT

When reviewing Ancestry ThruLines (or any potential Common Ancestor), go for the Triple Play!

Make sure the Common Ancestor AND the Side (Maternal/Paternal) AND the consensus of Shared Matches are all in agreement. If the CA is correct, they should be. Or at the least, there shouldn’t be a large conflict. I am finding a number of ThruLines under 15cM which do not agree with the Side. It is entirely possible to have a genealogy relationship (per ThruLines) which is not the same as the genetic relationship (I believe most of the “Side” designations are valid). This would mean there is also another Common Ancestor that agrees with the Side – entirely possible for my Colonial Virginia ancestry. Or the Side could be wrong…

In any case, when you don’t have a Triple Play, it calls for some extra thought and/or research.
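If you keep your notes in a spreadsheet, the Triple Play check could be sketched like this. Everything here is hypothetical: the function name `triple_play`, the surname labels, and the choice to treat a “Both” side as compatible are assumptions for illustration, not a rule.

```python
from collections import Counter

def triple_play(ca_line, ca_side, match_side, shared_match_lines):
    """Check that the CA, the Side, and the Shared Match consensus all agree.

    ca_line            -- ancestral line of the proposed Common Ancestor (any label you use)
    ca_side            -- "Paternal" or "Maternal" side of that Common Ancestor
    match_side         -- side AncestryDNA assigned to the Match
    shared_match_lines -- line noted for each Shared Match (None if unknown)
    """
    # Assumption: a Match marked "Both" does not conflict with either side.
    side_ok = match_side in (ca_side, "Both")

    # Consensus = the most common line among Shared Matches that have a known line.
    known = [line for line in shared_match_lines if line]
    consensus = Counter(known).most_common(1)[0][0] if known else None
    consensus_ok = consensus == ca_line

    return {
        "side_ok": side_ok,
        "consensus": consensus,
        "consensus_ok": consensus_ok,
        "triple_play": side_ok and consensus_ok,
    }

# Hypothetical example: CA on a paternal "Smith" line, five Shared Matches.
result = triple_play("Smith", "Paternal", "Paternal",
                     ["Smith", "Smith", "Jones", None, "Smith"])
print(result)

# Same Match, but AncestryDNA put it on the other side: needs extra thought.
conflict = triple_play("Smith", "Paternal", "Maternal",
                       ["Smith", "Smith", "Jones", None, "Smith"])
print(conflict["triple_play"])
```

A False result is not a verdict that the Common Ancestor is wrong; it just flags the Match for the extra thought or research described above.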

Just saying…

[22CB] Segment-ology: Go for the Triple Play! TIDBIT by Jim Bartlett 20231220

What Is the Next Segment?

Featured

A Segment-ology TIDBIT

A question recently came up: Are the Ancestors on two sides of a crossover point always a mother and father (in either order)? Or: If I know the Common Ancestor (i.e. the father or the mother of the TG couple) of a TG segment, must the next TG segment be the other parent of the TG couple? The answer is YES, with an important caveat: only when we are talking about the mother and father of our Ancestor who created the crossover.

Important scientific fact: A crossover is formed when a human recombines two Chromosomes to create a new Chromosome that is then passed to a child. One of the two Chromosomes is from the Mother, and the other is from the Father. So one parent is on one side of each crossover, and the other parent is on the other side of the crossover.   Here is Figure 6 from my 2015 blogpost: Segments – Bottom Up:

Note: each of the Chr 05 lines above is your Maternal Chr 05 – it’s just broken down for each generation. In the Grandparent look, the two crossovers were created by the parent using grandparent segments (assuming an average of 2 crossovers per generation for Chr 05). Note the Ahnentafel numbers to represent generic ancestors – even numbers are males, odd numbers are females. The first crossover created by the parent shows 7 & 6, or female & male, on the two sides of the crossover. When the first grandparent segment ends at the crossover, the next segment is the opposite parent. The second crossover created by the parent has 6 & 7 (male & female) on the two sides of the crossover.

The next line – the Great grandparent look has 2 more crossovers – created by the grandparents, when each of them recombined their respective 2 Great grandparent chromosomes. One of the crossovers is between 14 & 15 and the other between 13 & 12 (there was no crossover when the Ancestor 14 segment was passed to daughter 7). So again, each new crossover has a male and a female (in some order) on the two sides of each crossover.

Check out the two crossovers (on average) added at each of the next two generations – they all have the mother on one side and the father on the other side of the crossover.  Note carefully the word “added” (or created or formed).

Now here is the catch… In the Great grandparent look above, the last crossover has 12 & 14 on each side – two males. This seems to contradict the basic concept. And if we were applying the basic concept to TGs at the Great grandparent level it would be wrong. What’s up? Well, what looks like a crossover between Ancestors 12 & 14 is in fact a crossover – but it was formed by Ancestor 3 when she recombined Chr 05s from her parents 6 and 7 – these are the two parents of the ancestor who first formed (or added or created) the crossover.
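The Ahnentafel arithmetic behind this can be sketched in Python. In Ahnentafel numbering the parents of Ancestor n are 2n (father) and 2n+1 (mother), so halving a number (dropping the remainder) steps down one generation toward you. The function name `crossover_creator` is hypothetical, and the sketch assumes both Ancestors are given at the same generation, as in the figure.

```python
def crossover_creator(left, right):
    """Find who formed a crossover, given the Ahnentafel numbers on its two sides.

    Returns (creator, (father, mother)): the Ancestor who recombined, and that
    Ancestor's two parents, whose DNA sits on either side of the crossover.
    Assumes both numbers come from the same generation.
    """
    if left == right:
        raise ValueError("same Ancestor on both sides: not a crossover")
    if left.bit_length() != right.bit_length():
        raise ValueError("Ancestors must be from the same generation")

    # Step both sides down a generation until they share the same child.
    while left // 2 != right // 2:
        left //= 2
        right //= 2

    return left // 2, tuple(sorted((left, right)))

# The crossovers discussed above:
print(crossover_creator(7, 6))    # formed by 3, between her parents 6 and 7
print(crossover_creator(14, 15))  # formed by 7, between her parents 14 and 15
print(crossover_creator(13, 12))  # formed by 6, between his parents 12 and 13
print(crossover_creator(12, 14))  # looks like two males, but formed by 3 (parents 6 and 7)
```

The last call shows the catch: 12 and 14 trace down to 6 and 7, so the crossover is really the older one formed by Ancestor 3.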

When we form Triangulated Groups (TGs), we use groups of overlapping segments. But there is nothing in the TG criteria about the generation of the TG. We do understand that the TGs start and end at crossover points – when we shift from one Ancestor’s DNA to another Ancestor’s DNA. But until we can Walk the Segments Back (generation by generation), we don’t know when the crossovers were formed. There is one generation for each crossover, but until we have Chromosome Mapping we don’t know which generation it is.

Note: A TG Summary Spreadsheet will give good clues to the formation of crossover points – see Observation 5 (see linked blogpost).  In generation after generation the older crossovers can be seen, with only about 2 new crossovers in each generation. So the farther back we go with Chromosome Mapping, the newly formed crossovers will be there (with mother and father on the two sides). But the other crossovers may not appear to be mother/father, unless the origin of the crossover can be determined.

Bottom Line: With TG segments, sometimes the next TG on a chromosome will be the other parent, but more often it will not.

Edit 20240403: It was suggested that I add a Chromosome Map, showing segments from my 16 2xG grandparents. Here is one I did in 2013:

[22CA] Segment-ology: What is the Next Segment? TIDBIT by Jim Bartlett 20231209

Consensus

Featured

A Segment-ology TIDBIT

I was adjudicating a ThruLines from a Common Ancestor (CA) down to a Match. The grandchild of the CA didn’t fit. I find about 5% of my ThruLines are wrong so I just dotted the Match yellow (TL Wrong) to add it to that group. But as I was about to close out the Match, I clicked on Shared Matches (which I usually do anyway). The Match was at 13cM so I didn’t expect much. Surprise – over 20 Shared Matches, and almost every one was confirmed or “likely” to be on the line indicated by ThruLines! A clear consensus. I went back to the Match’s line and found another path that worked – back another generation from the ThruLines CA hint!!

The details don’t matter. The moral of this story is that a ThruLines CA AND a consensus of Shared Matches AND the AncestryDNA “side” should all be in agreement. This applies to CAs at other companies, too – the clues should be in agreement.

Takeaways:

1. When you find a CA, be sure to also review the Shared Matches and the side.

2. When you are searching for a CA with a Match, review the Shared Matches first to see if there is a consensus clue.

PS: this assumes you have diligently done your homework and put all known or likely CAs in the appropriate Notes (same for every company).

[22BZ] Segment-ology: Consensus TIDBIT by Jim Bartlett 20231206