Python Experiments in Physics and
Astronomy
Python Experiments in Physics and Astronomy acts as a resource for science
and engineering students or faculty who would like to see how a diverse
selection of topics can be analyzed and simulated using Python programs.
The book also provides Python solutions that can be learned from and
modified as needed. The book is mainly aimed at undergraduates, but since
many science students and faculty have limited exposure to scientific
programming, having a collection of examples that address curve fitting, Fast Fourier Transforms, image photometry, image alignment, and many other topics can be very helpful, not just for learning from but also for supporting classroom projects and demonstrations.
Key Features:
Python Experiments in Physics and
Astronomy
Padraig Houlahan
Designed cover image: Padraig Houlahan
First edition published 2025
by CRC Press
2385 NW Executive Center Drive, Suite 320, Boca Raton FL 33431
and by CRC Press
4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN
CRC Press is an imprint of Taylor & Francis Group, LLC
© 2025 Padraig Houlahan
Reasonable efforts have been made to publish reliable data and information, but the author and
publisher cannot assume responsibility for the validity of all materials or the consequences of their
use. The authors and publishers have attempted to trace the copyright holders of all material
reproduced in this publication and apologize to copyright holders if permission to publish in this
form has not been obtained. If any copyright material has not been acknowledged please write and let
us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access
www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive,
Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact
[email protected]
Trademark notice: Product or corporate names may be trademarks or registered trademarks and are
used only for identification and explanation without intent to infringe.
ISBN: 978-1-032-98189-5 (hbk)
ISBN: 978-1-032-98699-9 (pbk)
ISBN: 978-1-003-60004-6 (ebk)
DOI: 10.1201/9781003600046
Typeset in Minion
by SPi Technologies India Pvt Ltd (Straive)
Contents
Preface
Introduction
Chapter 1 Python and Object-Oriented Design Notes
Chapter 2 Exploring Data
Chapter 3 Signals and Trends
Chapter 4 Gravity Fields and Mass Distributions
Chapter 5 Spiral Galaxies and Dark Matter
Chapter 6 Sampling a Distribution
Chapter 7 Projectiles – The German 88
Chapter 8 Rocket Launches
Chapter 9 Building a Star Catalog from an Image
Chapter 10 Photometry: Measuring Object Brightness
Chapter 11 Aligning Images and Finding Targets
Chapter 12 The Saha Equation and the Balmer Spectrum
Chapter 13 Isochrons – The Ages of Rocks
Appendices
Index
Preface
This work is a result of a desire, after I retired, to build a collection of
programming projects concerning topics I had encountered during my
college years, both as a student and teacher. The motivations were many. In
some cases, I wanted to see how some complex or abstract technical
problems were solved, such as modelling how hydrogen spectral lines
changed with temperature or how images of star fields could be aligned
automatically. In others, it was to explore real-world systems, such as projectile motion, where the standard equations of motion break down at high velocities because of drag, or to see for myself how adding additional mass (Dark Matter) to galaxies can account for observed rotation speeds.
Creating and designing models and simulations of phenomena in physics
and astronomy is not only a great way to deeply learn how the systems
work through implementing their underlying equations, but it is also a
terrific way to share knowledge with others. Like many others, I think back on my own experiences as a graduate student, researcher, and lecturer: there were many times I wished I had decent, complete code examples to refer to, to see how solutions were implemented; I am unashamedly a 'monkey see, monkey do' kind of learner. I also feel that, whether for bootstrapping a project or supporting teaching presentations, such simulations can play a vital role, allowing lecturers to assign projects where students and faculty with limited programming experience and time can take an existing simulation and modify it to suit; perhaps to explore a whole new area by working with a different data column in a catalog, or even a different catalog altogether.
Obviously, there are limitations in what can be achieved in a single
volume. I have emphasized making code immediately available to the
reader, and to save space, I limit the normally expected in-code comments
and self-documentation and cut corners by not exhaustively testing and
hardening the code. My assumption is that those who would be interested in this work will have enough of a computing background, even if meager, and (most importantly) a willingness to look at the code examples and figure out how they work – I avoid clever and abstract computing techniques as
much as possible to encourage this reverse engineering. From my own
experience, both as a researcher and a teacher, I understand there are times
when researchers of all ages and faculty are at a fragile stage of learning
where they understand the broad strokes of computing but face an uphill
struggle to get over the next hump in the learning curve and can benefit
from good examples. The examples included here are intended for them.
They are not perfect, and they are deliberately primitive when needed. But
they are functional. And fun!
Introduction
This book explores a whimsical collection of topics from astronomy and
physics with a common underlying theme – when encountered in the
classroom, they often elicit a 'How was that done?' or 'I wish I could see
that demonstrated!’ response. To this end, I address the topics by
constructing software that models the systems, based on their underlying
scientific concepts, because I believe relatively simple software models can
be of great benefit to students and faculty wrestling with learning and
teaching complex processes. Having access to the simulations' source code lets users see for themselves how problems were solved and allows them to add code to reveal further details of interest.
Most topics here are in the realm of the applied, such as the various
astronomical image processing tasks we address, that is, photometry, image
alignment, and extracting object data from images; or for example,
modelling real-world artillery shell trajectories with ranges four or five times smaller than those of the drag-free models encountered in introductory
courses.
But even the applied can be very abstract. For example, a well-known
challenge in modelling is the creation of sample sets drawn from a
probability density function. How is this to be done? The answer, of course, is to sample the corresponding cumulative distribution function (CDF). While the proof can be presented very succinctly, it can leave students wondering whether they're missing something and wishing for tangible demonstrations. We address this by exploring a variety of density functions and showing how their CDFs can be created and then sampled, not just to verify the procedure but also to show how to proceed when unique or non-standard distributions must be addressed.
In a course on integral calculus, students usually learn how to calculate
the gravitational force for a sphere of radius R, with the perhaps surprising
result that the gravitational force at a distance r < R from the center only
depends on the mass inside of r; when r > R, it’s as if all the sphere’s mass
is concentrated at the center. By modelling mass distributions and adding up
the effects from all points in the distribution, we can demonstrate these
effects and explore if there are similar results for other geometries, such as
disks, cubes, and rings. The great thing about this kind of modelling is that
once you figure out how to create the distribution, you can find the gravitational effects at any location, not just along an axis of symmetry, which is what simplifies the analytic approach.
In astronomy courses, students are taught to account for galaxy rotation
curve behaviors, where the speeds of stars orbiting a spiral galaxy’s core
seem to level off with distance, instead of decreasing with distance. To
resolve this, additional mass must be added to the galaxy's outer regions –
Dark Matter. Trying to analyze the gravity fields from mass distributions
involving spheres, disks, and shells from arbitrary directions is not always
easy and often will not result in elegant equations. Under these circumstances, a brute-force approach, in which geometrical distributions are co-added and the contributions of all particles summed to determine the field at the sampling locations of interest, is easier; a graph of the result can be very
compelling for the student. Better yet, it is straightforward to switch from
gravitational forces to electrostatic ones, and a whole slew of charge
distributions can now be studied.
I also explore how hydrogen spectral lines are generated and create the
famous Balmer spectral series. The underlying mechanisms and mathematics are complex, even for my primitive approach, but by showing how to estimate how the line intensities vary with temperature, how temperature influences which line in the series will be the brightest, and the degree to which the gas (hydrogen) becomes ionized, the reader can explore these effects for themselves and perhaps try other simple atoms. There really is something wonderful about being able to model a process after undertaking an in-depth mathematical study of it; for the Balmer series, not only do we do that, but we also use our results to create simple synthetic spectra, which I think brings a sense of completeness to the effort by contrasting the theoretical with the observed graphically.
In assembling this diverse collection of topics, and in providing Python
classes and sample codes to explore them, I hope the reader will be
motivated to replicate some of the scenarios for themselves and learn from
seeing the solutions presented. The demonstration codes are very basic and
clean to make them as transparent and as brief as possible since I don’t want
the reader to feel like they are facing an intimidating wall of dense code.
My hope is the reader will appreciate being able to look at and study the
coding solutions I provide here and borrow from them and build upon them
for their purposes. I think there are many possible applications at the
college level where faculty and students could build on the examples for end-of-course or senior thesis projects; projects which would be too complex unless they could be jump-started by demonstrations like the ones here. Some of the examples here could serve as an independent
approach to cross-check other solutions, and to focus on a core process, to
see how it works. For example, there are freely available photometry
software packages that can be used to measure a star’s brightness. Some are
quite overwhelming in their complexity and appear to suffer from too much
building the new on top of the old, which introduces quirkiness and
instability. I would argue that when first encountering photometry, a simple, transparent application that can be easily explained, and modified as needed, would be a powerful resource for both the teacher and the student.
At all times, I do keep in mind the benefits of giving the reader access to
model code, and hope to encourage them to either modify or develop their
own, since I strongly believe the process of developing and coding models
helps foster a better understanding and appreciation of the underlying
processes being investigated.
The demonstration Python code presented in this project was designed
mostly for clarity and less so to be efficient. There is an assumption that the
reader has sufficient background to be able to learn from the examples or is
willing to work toward that level of proficiency. To reduce the code
example sizes, they are not heavily commented, and key details are
described in the Programming Notes sections. Note also the examples are
often ‘bare bones’ and have not been developed to meet all possible
applications; they are for demonstration purposes intended to illustrate an
approach to solutions; hardening the code to be more robust would greatly
increase the size. Regardless, the examples still serve as a first step for the
reader to build upon.
Efficient code can be very cryptic to read and maintain and would defeat
our teaching purpose. For example, when manipulating class parameters in
a function, I will often do something like x = self.x, and y = self.y so I have
local copies of the class variables, and then write a much easier equation
like ‘z = x+y’ instead of ‘z = self.x + self.y.’ Often, I will, somewhat inefficiently, extract data columns, such as for coordinates, from a dataframe
just for the convenience of being able to use them directly in an equation
instead of a more cumbersome dataframe indexing form. It’s all a matter of
judgment and personal style. And for terminology, while there is a benefit
in referring to a Python class function as a ‘method,’ I prefer to use the term
‘function,’ probably because I look at the problem from a mathematical
perspective. And similarly for class attributes, I prefer to call them variables
or parameters.
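As a brief illustration of this local-copy style, here is a minimal, hypothetical class (not one from the book):

import math

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def distance_from_origin(self):
        # Make local copies of the class variables so the formula reads cleanly.
        x = self.x
        y = self.y
        return math.sqrt(x**2 + y**2)

The small cost of the extra assignments is repaid by equations that look like the mathematics they implement.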
Since a major goal of this book is to allow the reader to see how things were done, a dilemma arises: the code must be shown and explained in sufficient detail that a reader with modest programming skills can understand what is being done, yet this kind of detailed discussion of mechanics and techniques can distract from the narrative. For this reason, I will regularly provide an overview of a chapter's underlying Python code before providing the complete listing, so it is available for reference.
$F = \frac{GmM}{r^2}$
To set the correct scaling for the force to match observation, when using
meters and kilograms, for distance and mass, respectively, a scaling
constant $G = 6.6 \times 10^{-11}$ is used; if using centimeters and grams, G would be $6.6 \times 10^{-8}$. Is there a system of units where G = 1? Sure. I could say if m
and M are the same and I call that mass 1, and they are separated by a
distance I will also call 1, then I now have a force F = 1 in that particular
scheme. Of course, if I want to compare my results with normal usage, I
will have to figure out how to convert them (inevitably using G). But here’s
the thing: No matter what system of units I use, for a given initial m, M, and
r, if I make m or M ten times larger, F will be ten times larger; if I make r
three times greater, F will become nine times smaller; in other words, all systems will have their forces change by the same multiplier.
We are, in a sense, treating the equation as having two parts: There is the
structural part with m, M, and r, which captures how gravity works, and
there is the scaling part. The structural part will always have the same
behavior, whether it’s for the gravitational attraction of an electron orbiting
a proton, or the Moon around the Earth, or the Sun orbiting the Galaxy:
Doubling the separation will reduce the effect by a factor of four and so on
– no matter what units are used. So, for clarity, and convenience, I will
sometimes omit the scaling constant and choose convenient units if I’m
mainly interested in seeing how a system behaves – mainly in seeing how
its plots look.
Code Examples
The code examples were written in Python, and use popular supporting
libraries such as Matplotlib, Pandas, Numpy, Math, Itertools, and Scikit. As
presented, they can be run from an IDE such as Spyder, but any IDE should
work. There is no reason they couldn’t be saved as standalone applications,
but this was not done here to avoid the headaches of saving code in a robust
form that would run on different operating systems and also because the
examples generally consist of a Python class developed for a particular task,
and a small test application that uses the class but with hardwired
parameters like filenames and number of input lines to read into the test
code. Some mechanisms, such as a configuration file or command line
argument reading capability, would need to be added.
I also chose to not write GUI/message-driven code in general. Elsewhere,
I have specifically written on how this could be done, but for the most part,
this keeps the focus on the project’s science and not the complex distraction
of managing GUI widgets and event-driven code. When dealing with a project like this, where showing the code solutions is a major part of the hoped-for benefit to the reader, a balance must be struck between how much of the code to explain and how much to show in a chapter; too much and the science narrative suffers, too little and the coded solutions and their purpose are unclear. I have tried to strike a balance where, in most cases, important code sections are included in the chapter, but complete listings are provided at the
chapter ends so the reader can quickly refer to the structure if they want.
In a work like this, the question arises as to how code segments should be shown. Stylistically, I use two methods to show code examples. I generally use a simple cut and paste from the IDE, and then use a terrific Microsoft Word Add-In called ‘Easy Code Formatter,’ which will build a text-based representation that can flow across page boundaries. A very nice feature is
that code listings can be split by selecting/creating a blank line and setting
the style to Normal for that line, which retains the formatted computer code
look-and-feel, enclosed by the accenting border. (Since both methods make
code examples available with their lines numbered, I will often simply refer
to the line number, and omit the figure number, for brevity.) There is no one
perfect solution, but these are useful options for readers who might need to
produce their own documents.
Since a major goal for this book is to make complete code examples
immediately available for the reader to peruse and study, I decided to
include the major code examples in each chapter, normally at the end, for
reference. While this can present the reader with a large block of code all at
once, I felt that efforts to subdivide code and discuss code subdivisions one
at a time actually made things more confusing. Simply referring to line
numbers within the one code block was clearer and more manageable. This
approach naturally led to adding reference sections in each chapter called
Programming Notes that discussed some of the more important features in
the code, without distracting from the chapter's primary narrative.
Finally, I will normally use bold-face style when referring to code
elements (functions, classes, variables, equations), but leave mathematical
equations and entities in normal type.
Chapter 1
Python and Object-Oriented
Design Notes
DOI: 10.1201/9781003600046-1
OOD Notes
For most of the code examples presented in this work, we rely heavily on
Python’s Object-Oriented Design (OOD) techniques. While OOD can be
quite nuanced when used to its full potential, it turns out even a little goes a
long way, and for the kind of scientific programming used here, being able
to use a few fundamental constructs allowed for the creation of code that
was quite manageable, easy to organize, and easy to apply. In this chapter,
we discuss some basic OOD concepts that were used repeatedly by the
different projects covered.
Using an OOD approach is really the way to properly manage your code.
You will likely find that once you start adding features and capabilities, you
will quickly end up with hundreds of lines of code in a file, and it can
become tedious to have to jump around when making small adjustments.
OOD’s class constructs will go a long way toward making your code
manageable and re-usable, but even then, you might be faced with bulky
files that are burdensome. One solution would be to add other classes or subclasses to your design; if you are happy with the current design but simply want to de-bulk the file, a possible (but ugly) strategy would be to place class methods in separate files and import them; however, if importing a collection of methods (defs) into a class, make sure the import statement is inside the class and everything is properly indented!
For some of our numerical experiments, we would like to run particular
models from a class of models. Since many models would have similarities,
they should have methods in common that could be stored in the parent
class. We would also like our models to be easily configurable, since we
don’t have the benefit of a dashboard with interfaces for sliders and buttons;
this can be done by using command line arguments.
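As a minimal sketch of this kind of configuration (the argument names and defaults below are hypothetical, not those used by the book's models), parameters can be read from sys.argv:

import sys

# Example invocation: python gsims.py shell 500 3.0
# sys.argv[0] is the script name; the remaining entries are strings.
model_name = sys.argv[1] if len(sys.argv) > 1 else "shell"
n_points = int(sys.argv[2]) if len(sys.argv) > 2 else 500
radius = float(sys.argv[3]) if len(sys.argv) > 3 else 1.0

print(f"Running model {model_name} with {n_points} points, radius {radius}")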
We will now illustrate some of these concepts by developing a system
with a parent class and two subclasses. What we wish to demonstrate here is how child objects inherit properties from their parents, if configured to do so.
Figure 1.1 shows three classes, a parent, child1, and child2. Both child1
and child2 use the parent class. The parent class has a variable b set to 8,
and two functions, __init__() and pprint(). Its __init__() function will set
the values of self.a and self.x, with self.x depending on the value passed as
an argument.
Now let’s see what happens when we create a child1 object from the
console and test its variables (see Figure 1.3).
Our first attempt at creating object c1 failed (line 7), because its
initialization (Figure 1.1, line 13) expected an input variable.
Retrying with an argument specified (line 10) worked, but when we test
for variables, c1.a didn’t exist, but c1.x did (see lines 13 and 22.) To
understand what happened, remember a child class has access to the
parent’s functions and variables, and child1 sets self.x because its
__init__() function was automatically called; the parent’s __init__()
function was not called, and so self.a is undefined.
If we wish to use parent variables, they could be fixed, such as having b = 8 (line 2, Figure 1.1). But what if we wanted to use a function to create a customizable plot layout that different child classes could use? Perhaps different classes want different plot sizes. In this case, it would
be useful if the common design was maintained as a function in the parent
class, but we’d like the child to be able to specify the size.
This can be achieved by having child classes call their parent’s __init__()
function as demonstrated with the child2 class (Figure 1.1 line 17). The
command ‘super().__init__()’ invoked the parent’s __init__() function. To
test this, let’s create a child2 object and test its variables (see Figure 1.4).
Now we see self.a is defined for the child because the parent’s
initialization was explicitly done using the super() command. Note,
however, that while the current value of c2.x is 7, the parent class used a
value of 9 for self.x but then self.x was updated after the parent was
initialized.
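Since Figure 1.1 is not reproduced here, the following minimal sketch is consistent with the behavior described above; the exact listing (and the value chosen for self.a) may differ in the book:

class parent:
    b = 8                      # class variable shared by all instances

    def __init__(self, x):
        self.a = 5             # only set if the parent's __init__() runs (value arbitrary here)
        self.x = x


class child1(parent):
    def __init__(self, x):
        # Does NOT call the parent's __init__(), so self.a is never set.
        self.x = x


class child2(parent):
    def __init__(self):
        super().__init__(9)    # run the parent's __init__() with x = 9
        self.x = 7             # then override self.x locally


c1 = child1(3)
print(c1.x, c1.b)              # 3 8   (c1.a would raise AttributeError)

c2 = child2()
print(c2.a, c2.x, c2.b)        # 5 7 8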
It is also worth noting in our examples that when we create a child object,
there is no parent object created; a child object can have features and
capabilities associated with the parent class design, but there is no separate
parent object. We can create a parent object and did in Figure 1.2, but we
don’t need to. In our examples, objects p, c1, and c2 are different
independent entities and the only things they have in common are
definitions and design.
There is much more that could be said about OOD but what we have
covered here is really all we need for the projects covered in this book. Our
minimal usage is more than adequate for creating coding structures that are
manageable and easily changed.
A Few Python Tips
Python is probably the most popular language used by data scientists and is
obviously very powerful. This kind of power comes at a cost – complexity,
and a learning curve. For scientists who don’t necessarily have an extensive
programming or computer science coursework background, there are some
concepts and topics that are worth reviewing when dealing with the usual
data structures, such as time series data and arrays, and a simplified
summary of ones regularly encountered in our code examples will now be
presented.
Lists vs Arrays
Python has two kinds of structures for holding vectorial information: lists and Numpy arrays. In Figure 1.5, we see how lists can be created and
merged. Lists can also be indexed, so in this example, l3[2] is 3 – remember
Python indexes start from 0.
But there is another way to achieve this, by using Numpy arrays. For example, if we import the Numpy library as np using a command like ‘import numpy as np,’ we can create a Numpy array and simply scale it by a multiplier. We can also combine Numpy arrays mathematically.
Examples of these capabilities are shown in Figure 1.7, where from the
console, we create two lists, use them to create numpy arrays x and y, and
then add the elements using z = x+y. We can recover the list form using the
list() function.
Figure 1.7 Numpy arrays, unlike lists, allow for adding vectors in
a traditional sense.
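Since the figures are not reproduced here, a short console-style sketch consistent with the description:

import numpy as np

l1 = [1, 2, 3]
l2 = [4, 5, 6]
l3 = l1 + l2            # list 'addition' merges: [1, 2, 3, 4, 5, 6]
print(l3[2])            # 3 -- Python indexes start from 0

x = np.array(l1)
y = np.array(l2)
z = x + y               # element-wise addition: array([5, 7, 9])
print(2 * x)            # scaling: array([2, 4, 6])
print(list(z))          # back to a plain list: [5, 7, 9]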
Multiplying a list by an integer gives a repeated list – a very useful tool for creating lists based on repeated patterns, from weekdays to annual cycles, when modelling sampling distributions. Sometimes we only want to use part of an array. This can be done using the ‘:’ operator, as shown in Figure 1.10. On line 4, we take the last two elements using ‘[-2:]’, all elements from index 2 onward using ‘[2:]’, and the elements between indexes 1 and 2 using ‘[1:2]’. Note that in Python indexing, a range such as ‘3:9’ does not include the 9th element (see lines 10–11).
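A brief sketch of repetition and slicing, mirroring the operations just described (Figure 1.10 is not reproduced here):

days = ["Mon", "Tue", "Wed", "Thu", "Fri"] * 2   # a repeated list
a = [10, 20, 30, 40, 50]

print(a[-2:])    # [40, 50]      -- the last two elements
print(a[2:])     # [30, 40, 50]  -- element 2 and everything after
print(a[1:2])    # [20]          -- the end index is not included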
It can be very useful to be able to move back and forth between the
worlds of lists and arrays. Many times we can simply use lists, but if
working with data that is vectorial or matrix like, Numpy arrays are very
convenient – especially if Numpy mathematical functions are needed.
DataFrames
Pandas dataframes are two-dimensional matrices that can hold many different kinds of data types. They allow data scientists to organize and manipulate multi-variate data. Newcomers to Python can be intimidated by them, often because they are a ‘level above’ simple arrays and can have non-numeric content. Some uses rely on cryptic and dense syntax, but for our purposes, a straightforward usage will be sufficient. In this section, we provide a summary of their main features, serving as an initial overview for the novice and as a reminder for the researcher who only deals with coding occasionally, and in doing so hopefully establish the baseline understanding necessary for those wishing to adapt our later examples.
In Figure 1.11, a dataframe (df) is created one column at a time. Column
‘x’ is created from a list on line 5, and column ‘y’ from a list on line 6. Once
created, a dataframe column can be accessed using either df[‘x’], or, since
the column name is a simple name, by df.x. However, df.x is not a list as such; it can be converted into a list using df.x.tolist(). On line 9, a new column is added by adding the two original
columns together. We could have implemented this in a simpler fashion:
df[‘z’] = df.x + df.y. The resulting dataframe can be seen by typing df at
the console or using print(df) (see Figure 1.12).
Figure 1.12 Using the print() command to display the dataframe.
The first column without a header in Figure 1.12 is called the index and
can be listed using df.index.tolist(), which would return a list [0,1,2].
We have seen how to extract a column; how do we extract a row from a
dataframe? We can do this using iloc, as shown in Figure 1.13, where
df.iloc[1] extracts the row, and again, we can use tolist() to convert it into a
list.
The iloc function is very powerful, and its argument can use up to two comma-separated lists to find the dataframe elements that meet the row and column selectors.
Remember, when using the ‘:’ selector, such as [2:8], only the
rows/columns from 2 through 7 are returned.
Sometimes we wish to extract values conditionally, for example, to find
rows in a dataframe based on column values. For example, if we wished to
extract rows in our dataframe here, where the y value was 5, we could do
this with df[df.y == 5].
To delete a column, such as column ‘z,’ do: del df[‘z’] (interestingly, del
df.z is not recommended).
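The operations described above can be reproduced with a few lines of Pandas (a minimal sketch; the book's Figures 1.11–1.13 may differ in detail):

import pandas as pd

df = pd.DataFrame()
df["x"] = [1, 2, 3]
df["y"] = [4, 5, 6]
df["z"] = df.x + df.y              # new column from the other two

print(df.x.tolist())               # column as a plain list: [1, 2, 3]
print(df.index.tolist())           # the index: [0, 1, 2]
print(df.iloc[1].tolist())         # row 1 as a list: [2, 5, 7]
print(df.iloc[[0, 1], [1, 2]])     # rows 0 and 1, columns 1 and 2 ('y' and 'z')
print(df[df.y == 5])               # conditional selection on a column

del df["z"]                        # remove a column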
We have barely scratched the surface here in reviewing dataframe basics,
but this is most of what we need for later applications.
Chapter Summary
With our brief demonstration code, we have shown how subclasses can use
resources from the parent, or customize them locally, and also how we can
invoke subclasses and pass them parameters using the command line. We
will use this kind of technique repeatedly throughout this book. It’s a very
primitive use of OOD but very effective for our purposes. A small amount
of OOD goes a long way. In the next chapter, we will follow this strategy to
allow us to build collections of models (subclasses) that can share functions
from the parent resource class. In doing so, we will have a solution, where
additional models could be easily included.
We also saw how we can have a choice between using lists and arrays, and we often need to be able to change from one to the other, but lists and arrays are very different. For vectorial problems, arrays should be used since they allow scaling and offsets to be applied element-wise. With dataframes, we can be very functional and effective if we know the basic rules described here: how to create dataframes from lists, how to convert columns to lists, and how to extract data based on either integer locations (iloc) or conditional matching. These tools will be relied on in
later chapters. In the next chapter, we will explore the first set of models
and simulations, where we will investigate gravity fields for particle
distributions we assemble.
Chapter 2
Exploring Data
DOI: 10.1201/9781003600046-2
Scientists work with data since without data, theories could not be tested
and revised. As we develop our understanding of science, knowledge of
theory must be matched with an appreciation of how to analyze data and
how to create models to test with the data. While this is the general theme throughout this book, in this chapter we will show how powerful even simple polynomial equations can be for modelling fundamental theories, how graphical presentations can lead to striking conclusions, and how they serve as a good introduction to Python's plotting capabilities.
Consider a very simple observation: The Earth orbits the Sun in about
365 days. We now know there is an equilibrium at play, between
gravitational forces, the masses of the Sun and Earth, the Earth’s velocity
(speed and direction), and the distance between them. But there are other
possible factors: their sizes and rotation speeds; their composition and
surface temperatures. Perhaps you think I’m over-stating things a little.
How could temperature and composition be factors? Well, consider this: because of its high surface temperature and luminosity, the Sun creates pressure from photons and emits boiled-off particles. As a result, a mylar
balloon placed at the Earth’s distance from the Sun will not orbit in 365
days; it will be pushed outward and might escape the Solar System because
of radiation pressure. So, gravity is not always the dominant force.
The point I’m making here is that real-world problems might have many
factors to consider, and the scientist or engineer needs to be able to decide
which are important and which are not, so the simplest possible descriptions
(laws) can be developed that explain the observations. Ultimately, they are
trying to identify patterns – relationships between variables – and there is absolutely no excuse for undertaking such a search without exploring how various observations/variables interact, by examining plots of one against another. If relationships are found (i.e., the plots are not just randomly scattered points), then a relationship is being shown between the variables; better yet would be to be able to say whether the relationship is universal and relevant to all such systems, not just the one being investigated.
$P = a^{3/2}$ (2.2)
It is easier to understand his reasoning by looking at the data graphically.
In Figure 2.1 the top panel shows P vs a for the planets Mercury through
Saturn (Kepler wouldn't have known about the others). The middle panel shows a linear and a quadratic curve drawn to the same x- and y-axis limits, which shows the planets operate between these two extremes. The third
panel shows the planets with the linear and quadratic curves, and a curve
with exponent 1.5 – which is seen to exactly match the planetary data, so
Kepler’s third law works!
If we didn’t know about Kepler’s third law, we could have used this
approach to discover it, by trying curves with different exponents until we
found the best one through trial and error. This is essentially what Kepler
did – he had no underlying theory of physics to justify his results.
It is worth noting that ‘coincidences’ should always get our attention. Kepler found the period depended on a to the 1.5 power, not 1.49, 1.47, or 1.53. Is it a coincidence that the needed power was exactly 1.5, or equivalently the ratio of two integers, 3/2? It is not a coincidence, and while Kepler might have used 1.5 on the assumption it was exactly right, strictly speaking this was supposition on his part. It was only when Newton applied his famous law of gravity to planetary dynamics, and derived Kepler's laws from it, that it was known that, yes indeed, P depends on a to the power of 1.5. Why would the law of gravity cause this? Because Newton's law of gravity itself builds in an integer power of 2: the force depends on $1/r^2$, which can be rewritten as $4\pi/(4\pi r^2)$, and since the denominator is just the area of a sphere, that power of 2 is not an approximation; the area of a sphere is a perfect power law, hence gravity depends exactly on the power of 2.
When analyzing data, it is important to appreciate the role played by the
scales of things. In Figure 2.1, we used units where periods were in years
and distances in AUs. That was human bias. If we grew up on Jupiter, a
scientist there might want the period for Jupiter to be one. Would that make
any difference? This is easy to test: let's rescale the x and y axes by constants B and A, respectively, so Equation 2.2 becomes:
$AP = (Ba)^{1.5}$ (2.3)
and hence,
$P = \frac{B^{1.5}}{A}\,a^{1.5}$ (2.4)
So,
$P = K a^{1.5}$ (2.5)
For the Solar System, because the Sun is so massive, the sum of a
planet’s mass and the Sun’s is essentially the same as the Sun’s, and in our
system of units using the year and the AU, the Sun’s mass is 1. This hidden
mass dependence needs to be taken into account, and for this reason, we
cannot simply combine systems with different central masses, because each
system has its own constant of proportionality, which depends on the
central mass (mainly) and the unit choices. This means that we can compare
different systems and show they all obey Kepler’s third law, if we divide the
(P, a) values for each system by any value pair from that system, since by
doing this, we divide out the constants of proportionality and the mass
terms in each system.
We will now demonstrate this by considering the four Galilean Moons,
named after Galileo, who, after observing them move, famously concluded that not everything orbited the Earth – a devastating criticism of the widely
believed Geocentric Universe cosmology. Again, we can overlay the data
for the Galilean moons and the planets when each set is normalized to one
of its members to adjust for each system having a very different central
mass (see Figure 2.2), and because of this both sets will contain a common
point (1,1). Since both systems show obedience to the 1.5 power-law,
Kepler’s third law applies to both, as does Newton’s law of gravity.
Figure 2.2 P vs a for the planets (‘+’ markers), along with the 4 Galilean moons (‘o’ markers).
Because objects in the planetary (Solar System) and Jupiter-orbiting systems are each scaled by one member of their own system, their constants of proportionality are set to one, and they all obey the same power-law rule.
There is another elegant way to view our data, namely by plotting log(P)
vs log(a), since on a log-log plot, data points obeying a power law fall on a straight line, where the line's slope equals the exponent. This is shown in
Figure 2.3 where the planets fall on one line, and the Galilean moons on
another, but their respective lines are parallel and have a slope of 1.5, again
showing planets and Jupiter's satellites obey Kepler's third law. Using this technique, we don't need to normalize the systems to take the mass effect into account; all systems obeying the 1.5 power law will fall on parallel lines, demonstrating the law's universality.
Figure 2.3 Power law relationships become straight lines when
plotted on a log-log plot. Here, for the Galilean moons, and the
planets, their log(p) vs log(a) values are plotted. Because each
system obeys Kepler’s third law, both systems fall on straight
lines of slope 1.5.
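A short script along these lines reproduces the essence of Figure 2.3; the planetary and Galilean-moon values below are standard published figures, quoted approximately, and the script is a sketch rather than the book's listing:

import numpy as np
import matplotlib.pyplot as plt

# Orbital periods (years) and semi-major axes (AU) for Mercury through Saturn.
P_planets = np.array([0.241, 0.615, 1.0, 1.881, 11.86, 29.46])
a_planets = np.array([0.387, 0.723, 1.0, 1.524, 5.203, 9.537])

# Galilean moons: periods in days, semi-major axes in thousands of km.
P_moons = np.array([1.769, 3.551, 7.155, 16.689])
a_moons = np.array([421.7, 671.0, 1070.4, 1882.7])

plt.plot(np.log10(a_planets), np.log10(P_planets), "+", label="Planets")
plt.plot(np.log10(a_moons), np.log10(P_moons), "o", label="Galilean moons")

# Both sets fall on straight lines of slope ~1.5; the intercepts differ
# because the central masses and units differ.
print(np.polyfit(np.log10(a_planets), np.log10(P_planets), 1))
print(np.polyfit(np.log10(a_moons), np.log10(P_moons), 1))

plt.xlabel("log(a)")
plt.ylabel("log(P)")
plt.legend()
plt.show()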
Once run, an instance of the class is created (line 16) as bsc, and a
preview of the bsc.df dataframe can be displayed on the console with a command like print(bsc.df), or the IDE's variable explorer can be used to examine its contents (as shown in Figure 2.8).
We only need the vmag, class, and parallax columns since these can be
used to estimate the absolute magnitude (M), the luminosity (L), and the
temperature (T), since an HR diagram is a plot of L (or M) against T (or
spectral type).
Knowing a star's apparent magnitude vmag (or m) and parallax p, and hence its distance d = 1/p in parsecs, the absolute magnitude is estimated from:
$M = m - 5\log_{10}(d/10)$ (2.7)
The code relies on the PANDAS library and on its dataframe structures to
support data input and management. Among the key dataframe features
used were:
Reading the csv data file into a local dataframe (line 15)
Excluding/filtering dataframe rows (lines 16 and 91)
Renaming dataframe column names (line 17)
Extracting a dataframe column into a list (line 20)
Adding a list as a new column in a dataframe (line 23)
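A hedged sketch of these steps is shown below; the filename, column names, and filter condition are illustrative, not necessarily the catalog's exact ones:

import pandas as pd
import numpy as np

df = pd.read_csv("bsc5.csv")                 # read the catalog (filename assumed)
df = df[df["parallax"] > 0]                  # filter out unusable rows
df = df.rename(columns={"Vmag": "vmag"})     # rename a column

m = df["vmag"].tolist()                      # extract columns as lists
p = df["parallax"].tolist()

d = [1 / p0 for p0 in p]                     # distance in parsecs, d = 1/p
M = [m0 - 5 * np.log10(d0 / 10) for m0, d0 in zip(m, d)]   # Eq. (2.7)

df["dist"] = d                               # add the lists back as new columns
df["absmag"] = M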
The class is initialized using the usual __init__(self) function and creates
lists for variables to be used, and also sets the colors and markers for
plotting the luminosity classes (lines 11–12).
The star catalog is imported by the read_bsc function, and at line 21, a distance column is built from parallax using the formula d = 1/p. Note the very
useful and compact strategy for operating on a list where we calculate a list
of distances from a list of parallax values: d = [1/p0 for p0 in p].
New columns are created in the dataframe for absolute magnitude,
luminosity, and estimated temperature by function get_MLT_values().
Spectral subclasses are mapped to temperatures (line 26) by creating a
dictionary by function build_spectral_type_temperature_dictionary().
The get_star_spectral_type_and_class function parses the input
spectrum string information into new columns for subclass (i.e., B5) and
luminosity class.
Plots were created by adding in one luminosity class at a time (see lines
97 and 120), selected using the get_stars_in_lum_class function, because
this offered more flexibility in selecting colors and marker symbols to
emphasize the different classes. Note, when specifying markers, the size
parameter in the plt.scatter functions was set to 10; going much smaller
tends to turn all marker shapes into dots. The functions to generate the
specific plots were make_HR_Diagram and make_m_vs_T_diagram.
The class is contained in the file bsc.py, and running this file will execute
the example code after the line if __name__ == ‘__main__’: (see line 143)
and produce the above plots.
The data column names are easily understood; there are columns about
the binary systems themselves (e.g., names, periods, references), but we are interested in properties of the individual stars. Physical properties are generally of a form like ‘logM1’ or ‘logL2’: in the first instance, the log of the first star's mass, and in the second instance, the log of the 2nd star's luminosity. Some column headers end with an ‘e’ (e.g., logL2e) to indicate the measurement error.
There are about 313 binaries in the catalog, which means there are more
than 600 individual stars with L, M, T, and R measurements available to
use. It is important to note that this catalog is built from the efforts of
astronomers taking the time to study how each binary system behaves over
time, and then undertaking some spectroscopic analysis or estimate, and
measuring star sizes and masses; a very laborious and time-consuming effort, which is why it is small compared to other catalogs with hundreds of millions of entries. Also, because much of the particular information must be derived from different kinds of observations, there are gaps that have yet to be filled, which means not all entries can be used; on the other hand, the catalog is constantly growing and improving with time.
Since we are interested in individual star properties, we do not need star properties grouped by binary membership, and we avoid that burden by merging the imported data: we append column logL2 to the end of column logL1 and simply refer to the result as logL, and similarly for logM, logR, logT, and spectral type (SpT).
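A minimal sketch of this merging step with Pandas (the filename is assumed and the book's actual code may differ):

import pandas as pd

# Stack each star-1 column on top of its star-2 counterpart so every star,
# regardless of which member of the binary it is, appears once.
debcat = pd.read_csv("debcat.csv")
logL = pd.concat([debcat["logL1"], debcat["logL2"]], ignore_index=True)
logM = pd.concat([debcat["logM1"], debcat["logM2"]], ignore_index=True)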
DEBCat therefore gives us access to a collection of fundamental physical
properties: L, M, R, and T, along with an assigned spectral type. We know
that a sphere of surface temperature T and radius R will have a luminosity L
given by:
$L = 4\pi R^2 \sigma T^4$ (2.8)
That is, the area of the sphere times the energy released by each square
meter at temperature T. How do our stars match up with this very simple
model? How does star mass affect temperature? Luminosity? Does radius
change with temperature? These are questions we can now explore using
this data, but first, it is important to remember that astronomers have
identified different types of stars, different luminosity classes (I–V) which
show whether they are on the Main Sequence or are giants or supergiants.
We will differentiate among these luminosity classes using colors and
markers in our plots. (Do not be confused by this color-coding scheme, which is intended only to differentiate among the luminosity classes; in reality, class V stars span colors from red to blue, even though they are all drawn using blue dots in Figures 2.13, 2.14, and 2.15.)
Figure 2.13 The HR diagram. The Main Sequence stars appear
well differentiated from the giants and supergiants. The slope here
is easily seen to be about 6, which means the radius must be
growing almost linearly with temperature for Main Sequence
stars.
Figure 2.14 Plotting log(L) against log(M) shows a clear and
well-defined relationship for the Main Sequence stars.
Figure 2.15 A plot of log(R) against log(T) shows Main Sequence
stars grow almost linearly with temperature.
Our problem, then, is how to extract luminosity class data from these spectrum description strings. Our solution was to use the Python re library's findall() function with regex string-matching specifiers as follows: the key parts are the () terms shown in boldface, which attempt to break down descriptors in a format like XY.Y_Zm into five character groups: (X)(Y.Y)(_)(Z)(m).
([A-Z]): this matches a single uppercase letter for the spectral classes like
O, B, A…
([O0-9.]{0,3}): matches groups of numbers and decimal points; there might be 0 to 3 of them (the curly brackets add this capability), so this set matches subclasses 3, 3.5, or even a missing specifier. We included an ‘O’
(uppercase letter ‘o’) since we noticed a typo in one instance where ‘o’ was
used instead of zero – it is easy for a human curator to make errors like this
or to copy an original source error, and so this is a useful work around.
([_]{0,1}) Here the matching allows for the presence (or absence) of an
underscore character. Again, some descriptors include it; others not.
(IV|V?I{0,3}) This is the part that searches for the luminosity class
roman numerals. It reads as follows: accept either IV, or V, or 0-3 I
characters.
(.*) This catches any remaining character.
The command re.findall(), with these matching specifiers returns 5
parameters (see line 39) and the luminosity class is contained in the fourth.
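A sketch of this parsing step follows; the pattern is assembled from the groups described above, though the book's exact specifier string (and the sample descriptors used here) may differ slightly:

import re

pattern = r"([A-Z])([O0-9.]{0,3})([_]{0,1})(IV|V?I{0,3})(.*)"

for spec in ["B5V", "K3III", "A0_IV", "G2"]:
    groups = re.findall(pattern, spec)[0]    # a tuple of the five captured groups
    letter, subclass, _, lum_class, rest = groups
    print(spec, "->", letter, subclass, lum_class)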
As with the K3L class, plots are created by adding stars by luminosity
class which allows us to assign markers and colors by class.
For color choices, beyond the simple single-letter base color specifiers
(e.g., ‘r’ for ‘red’), other color options can be used, such as Matplotlib's tableau palette (‘tab:orange’) – see line 12.
Note also entries without spectral type information were excluded since
luminosity was of prime interest, but this restriction could be usefully
relaxed. Finally, if other relationships are of interest, then a more general
plotting function of the form ‘plot_logX_vs_logY’ could be developed
based on the existing ones, instead of our customized solutions for logL vs
logT, etc.
Summary
In this chapter, we saw how powerful and compelling decent graphical
representations of data can be for identifying relationships among different
variables, and for processes that have power-law behaviors, using log-log
plots can demonstrate their presence and magnitude. A more advanced
approach would use curve fitting techniques to find best match models, but
we were still able to reveal behaviors and underlying astrophysical rules
governing fundamental star properties. We also saw how to download and
import catalogs into dataframes, and then separate out the variables of
interest. Because different researchers used slightly different notations
when categorizing stars, we found the character pattern matching re.findall
function could be used to parse the different notations to produce a
consistent one for our use.
In the next chapter, we will consider the problem of how to detect
periodic signals embedded in data and show how the Fast Fourier
Transform can be used to identify such signals so essential data can be
succinctly summarized.
Chapter 3
Signals and Trends
DOI: 10.1201/9781003600046-3
In Figure 3.1, the sine wave clearly dominates and is easily detected and revealed by the power spectrum. In our scenarios, we used a period of p =
48 steps. Because our FFT function only operates on an input array, it is up
to the code developer to provide the interpretation of what units of time and
frequency are in use. In our models, when we invoked a scenario, we also
specified a timestep size of 1/24, so in real-time units, the period is 1/24 *
48 = 2. Hence the frequency is 1/p = 0.5, and indeed our power spectra
show a peak there. (We chose 1/24 as the step size instead of 1, to both
demonstrate how to use the step size parameter, and also because when we
will look at the www.met.ie data, the measurements we use will be hourly,
so a step size of 1/24 produces frequencies of 1/day.)
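A minimal sketch of this kind of analysis with Numpy follows; the book's scenario class is not reproduced here, and the synthetic signal below is only illustrative:

import numpy as np
import matplotlib.pyplot as plt

dt = 1 / 24                        # time step in days (hourly samples)
n = 24 * 365
t = np.arange(n) * dt
signal = np.sin(2 * np.pi * t / 2) + 0.5 * np.random.randn(n)   # period = 2 days

spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(n, d=dt)   # frequencies in cycles per day
power = np.abs(spectrum) ** 2

plt.semilogy(freqs, power)         # log y-scale compresses the range of peak heights
plt.xlabel("Frequency (1/day)")
plt.ylabel("Power")
plt.show()

The power spectrum of this synthetic signal peaks at a frequency of 0.5 per day, matching the period of 2 days discussed above.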
In producing our charts, we used a log scaling on the y-axis because very
often there can be a very wide range of spectrum peak heights, and the log
scaling compresses the y-scale.
In Figure 3.3, the charts from Figure 3.2 are recreated using a linear
scaling, and the peak at 0.5 is very clear. The lesson here is important: small
peaks in log plots can be very significant!
Figure 3.3 The same results as from those used in Figure 3.2 but
without the log scaling on the y-axis.
Figure 3.7 A simple script to display rain data from 2000–2022, using log scaling.
Figure 3.8 Charts produced by the script in Figure 3.7.
Because the data record is so long, only the last 365 days in the selected data are used for the top plot, to give an overall sense of annual behaviors.
Because the rain is so persistent throughout the year, linear scaling was used
because log scaling suppressed the peaks’ visual impression. The results are
very interesting. Certainly, there are periods (gaps) in the rain, but it looks
like rain is a possibility at any time of the year, perhaps a little less so in the
summer. The largest peaks in the spectrum correspond to an annual trend (f ≈ 0.003) and a twice-daily cycle (f = 2). Interpreting other peaks is best left to meteorologists, but we can at least note that there is also a broad
distribution of energy over a wide range of frequencies, a characteristic of
turbulence/chaos.
What about temperature? The results are dramatic and are shown in
Figure 3.9.
Figure 3.9 Results from 2015–2022 temperature measurements.
The top panel shows a strong annual cycle, and the bottom panel
shows two dominant frequencies, annual and daily.
To build the chart and detect the trend, the data was first smoothed by
applying an averaging filter (window) of size 365*24 to span a year’s worth
of data, so in the smoothed data array, the value at position i is the average
of all the data between i-365*24 and i which suppressed daily and seasonal
effects. This asymmetric averaging was done to simplify finding a linear fit,
that is, allowing it to be applied in the range [365*24, N].
With the smoothed data, the Numpy polynomial fitting routine could now be applied and looks like: c1, c0 = np.polyfit(xvals, yvals, n). In polyfit, n is the order of the polynomial (we used n = 1 to obtain a linear fit), and the function returned the slope (c1) and intercept (c0). Using n > 1 would produce higher-order polynomial fits and a longer list of coefficients.
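A hedged sketch of the smoothing and fitting steps described above (the filename and variable names are illustrative, and the trailing running mean here is a stand-in for the book's averaging filter):

import numpy as np

window = 365 * 24                          # one year of hourly samples
temps = np.loadtxt("hourly_temps.txt")     # assumed 1-D array of hourly temperatures

# Trailing running mean: each output value averages the preceding year of data.
smoothed = np.convolve(temps, np.ones(window) / window, mode="valid")

xvals = np.arange(window, window + len(smoothed))
yvals = smoothed

c1, c0 = np.polyfit(xvals, yvals, 1)       # slope and intercept of the linear fit
print("trend per hourly step:", c1)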
$F = \frac{GMm}{r^2} = mg$ (4.1)
Our interest in the disk and the ring distributions arises from the fact that
they are distributions of interest in electrostatics, and also because in
astronomy, many galaxies, planets, and protostars, have disk-like structures.
From calculus, we learn that on the outside, the gravitational field of a
uniform spherical shell is the same as if all the shell’s mass was
concentrated at the center. Furthermore, once inside the shell, the gravitational force disappears. Since a uniform solid sphere is simply a collection of shells, we can now say that outside a ball, the gravitational field acts as if all the ball's mass were at a point at the center; but interior to the ball, at any point, the exterior shells have zero net effect, only the mass interior to that point's radius counts, and it too behaves as if it were concentrated at the center.
One question we will explore for all these models is to what extent their gravitational fields behave like that of a shell or a sphere. Do their
gravitational fields obtained from adding in the effects of all points match
the ideal’s (all interior/enclosed mass is at the center)? To answer this, we
include the ideal plots which are generated as if only the mass interior to a
sampling point was relevant and that interior mass was concentrated at the
center.
Our strategy will be to write functions to distribute particles according to
the geometry we are interested in, which will result in a dataset consisting
of the x, y, and z positions for each particle. Then, for a set of positions
along a specified radial-axis of the system, we will calculate the distance (s)
from each position to each particle, and then the total gravitational force is
calculated at each sampling position. On completion, we will have a set of
net gravitational effects for each sampled position: we can then plot the
field strength versus the radial-axis position.
We will specify an observer location for each model that will define the
radial being sampled and which will allow our models to be used not just
for the normal axes of symmetry typically used in textbooks, but also others
that we can explore.
Our simulation software system consists of six files:
gsims.py to manage and invoke the desired model type. It imports the
class definitions from the individual model class files.
grav_sim.py holds the class definition for the parent class (grav_sim).
class_ring.py holds the definition for the ring_model subclass.
class_disk.py holds the definition for the disk_model subclass.
class_single_shell.py holds the definition for the single shell subclass.
class_double_shell.py holds the definition for the double shell model
subclass.
class_sphere.py holds the definition for the sphere’s subclass.
All subclasses have the same structure and contain functions to build the
distribution, get the ideal radial profile, and create a two-panel plot: one
showing the 3-D distribution, and the other the numerical and the ideal
radial profiles.
Measurements are taken along a radial from the origin out to a point
self.Obs, defined by [Rx, Ry, Rz] which represents the tip of the sampling
axis; [x, y, z] are the coordinates of the model's mass points; g holds the calculated gravitational fields; and dlist holds the sampling distances.
At line 27 (get_radial_gravitational_field), the gravitational effects of
the models’ mass distributions are calculated.
Each measurement involves selecting a position d along the sampling
radial (line 40), and for each mass point’s position X, finding the vector
from X to d (i.e., s = d – X) (see lines 44–52) and using that to get the
gravitational field (line 59). The dot product of the unit vectors for s and d
gives the component of g along the sampling radial (lines 58 and 60).
Only the radial components of the fields are calculated because our plans
are to explore symmetric distributions with sampling radials along either
the x-, y-, or z-axis, even though we have generalized so the radial could be
in any direction. Studying fields transverse to the radial sampling would
require some additional code changes.
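The heart of that calculation can be sketched as follows; this is a simplified stand-in for get_radial_gravitational_field(), with unit masses, G set to 1, and hypothetical variable names:

import numpy as np

def radial_field(points, obs, n_samples=50, min_dist=0.05):
    """Net radial gravitational field at positions along the radial from the origin to obs.

    points : (N, 3) array of unit-mass particle positions
    obs    : 3-vector marking the tip of the sampling axis
    """
    obs = np.asarray(obs, dtype=float)
    obs_hat = obs / np.linalg.norm(obs)          # unit vector along the sampling radial
    dlist, glist = [], []
    for frac in np.linspace(0.05, 1.0, n_samples):
        d = frac * obs                           # sampling position along the radial
        g = 0.0
        for X in points:
            s = d - X                            # vector from the mass point to the sample
            r = np.linalg.norm(s)
            if r < min_dist:                     # skip points that are too close
                continue
            s_hat = s / r
            # component of the 1/r^2 pull along the sampling radial
            g += -np.dot(s_hat, obs_hat) / r**2
        dlist.append(np.linalg.norm(d))
        glist.append(g)
    return dlist, glist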
The parent class (grav_sim) uses six model command line arguments in
its __init__() function – which are used by all models.
Note, on line 55, we ignore any point within a minimum distance of the
sampling point to prevent divide by zero kinds of instabilities. If the mass
distribution is intended to model a continuum, then ignoring the nearest one
is probably okay since in any case, a continuum wouldn’t have a
concentrated point. The minimum distance used here was determined
through trial and error.
It is important to appreciate that the get_radial_gravitational_field() method works for any supplied distribution (the [x, y, z] lists for the mass points).
It’s where the bulk of the model number-crunching occurs and is properly
embedded in the parent class as a resource to be used by any model.
Function get_ext_inv_r_squared_field() calculates the ideal field
outside the main mass distribution(s), without having to get the distances
between the sampling position and the individual masses since it assumes
all mass is concentrated at the origin.
With the double-shell model we will explore later, we will calculate the
ideal twice: once using both shell masses and also for the inner shell only.
The idealized traces are stored in trace1 and trace2 with dlist1 and dlist2
being their corresponding positions.
The remaining method offered by the parent (line 83) supports plotting roles; it is a small utility used to create a plotting label string.
Because all our model class designs have a very similar structure, we will
now provide a more detailed discussion of the single shell model and be briefer with the others – shown at the chapter's end.
Class single_shell_model Programming
Notes
In Figure 4.3, we see the single_shell_model class definition. This class is a
subclass of the parent (grav_sim) class because it names the parent as an argument in its class statement (see line 5). Notice how, while the single_shell_model class has its own
__init__() method, it uses the parent’s to initialize variables used by all
models (see line 8). As part of its initialization, it automatically runs the
model by invoking the do_single_shell_system() (line 9). Note also, we did
not need to instantiate the parent class; having access to its definitions was
sufficient.
Figure 4.3 The single_shell_model class.
All our models are run when their subclasses are created and initialized,
and running a model has the same procedure in all cases: add points to the
desired mass distribution, get the radial gravitational fields, get the ideal
fields, and build the two-panel plot.
The self.add_points_to_shell() function used to add points to the model
uses a clever algorithm (found on the stackoverflow.com website) for
distributing points over a shell so that they are reasonably evenly
separated, and it provides the [self.x, self.y, self.z] data needed for
calculating gravitational fields. (Each model class has its own customized
function to build its mass distribution geometry.) The class has an
appropriate wrapper to get the ideal field (line 40), and it has the
instructions to assemble the two-panel plot. Note that the plotting
procedure uses methods from the parent class, automatically available to
the subclass.
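The exact algorithm from stackoverflow.com is not reproduced here, but a commonly cited approach for spreading points roughly evenly over a sphere is the golden-spiral (Fibonacci) method; the sketch below (the names are ours, and the book's routine may differ) conveys the idea:

import numpy as np

def fibonacci_shell(n, radius=1.0):
    # Spread n points roughly evenly over a sphere of the given radius
    # using the golden-spiral method (one common Stack Overflow answer;
    # the book's actual routine may differ).
    i = np.arange(n)
    phi = np.pi * (3.0 - np.sqrt(5.0))     # golden angle in radians
    z = 1.0 - 2.0 * (i + 0.5) / n          # z evenly spaced in (-1, 1)
    r_xy = np.sqrt(1.0 - z**2)             # radius of each z-slice
    theta = phi * i
    x, y = r_xy * np.cos(theta), r_xy * np.sin(theta)
    return radius * x, radius * y, radius * z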
Overall though, the model worked very well: it demonstrated that the ideal
curve, based on assuming the enclosed mass is concentrated at the center,
really does describe the field outside the shell, and also that the
gravitational field is indeed zero inside.
Note, we did benefit from not approaching any particle too closely. If we
changed the sampling axis to lie along the y-axis, a slightly different result
is obtained (see Figure 4.5).
Figure 4.5 Sampling along the y-axis enabled us to straddle a
particle that exerted excessive pulls inward and outward.
Otherwise, the behavior is very close to the ideal.
Figure 4.6 Double shell model results with 500 points divided
between two shells of radii 1 and 3. The ideal curves (blue dots)
show the 250/r² and 500/r² trends consistent with calculus.
Not surprisingly, there are small sampling effects near the shell
boundaries, but the major conclusion is that, for uniform shells, the
results from calculus indeed hold true.
For the disk models, sampling points taken along the plane of the disk
often fell too close to individual particles, which then had undue
influence; points closer than 4*res were therefore excluded.
The results of x-axis sampling are shown in Figure 4.9. For comparison,
Figure 4.10 shows the results where the sampling vector lies along the
z-axis.
Figure 4.9 Approaching the disk center along the x-axis can
produce noisy results from getting too close to individual
particles.
Summary
In this chapter, we modeled various particle spatial distributions to study
their gravitational effects. Using them, we could verify properties known
from calculus and test models against ideal behaviors. There are many
distributions we did not attempt, such as cubes, cylinders, and lines of
particles, which would be easy extensions. Most importantly, we found that
the critical models involving shells and spheres matched the results
expected from theory, which is not only useful but also builds insight.
Most of the models could be assembled into more complex ones – in fact,
as we will see in the next chapter, we can model a spiral galaxy's structure
by combining spherical and disk distributions. While our models sampled
along axes of symmetry, which are usually amenable to algebraic analysis,
measurements along non-symmetric axes could be done just as easily, even
though such cases would often have no easy analytic solution. And of
course, with a simple modification allowing particles to carry an
electrical charge, many of the models could also be used to study charge
distributions.
Chapter 5
Spiral Galaxies and Dark Matter
DOI: 10.1201/9781003600046-5
When astronomers study spiral galaxies, one thing they measure is their
rotation curves which show how fast the stars are moving at various
distances out from the center. For example, on his website
(https://w.astro.berkeley.edu/~mwhite/darkmatter/rotcurve.html), Professor
Martin White includes a figure from the study by Begeman (1989) showing
the rotation curve for galaxy NGC3198 (see Figure 5.1).
Figure 5.1 The rotation curve for galaxy NGC3198 shows how
fast the stars are moving (orbiting the center) at various distances
out from the center.
Where things get really interesting is that when we model a spiral galaxy
as having only a core and a disk, this is not sufficient to explain the
observations; a third, invisible component surrounding the whole galaxy is
needed, which we call 'dark matter.'
We will now adapt the tools developed for studying mass distributions to
explore galaxy rotation curves: we will build galaxy models with a
spherical core and a disk, and then see whether the resulting rotation
curves can be adjusted as needed, using a halo of matter mimicking dark
matter, to give results similar to observed rotation curves.
Because we are considering one particular class of galaxy models (core,
disk, and halo) and because we will be considering velocity and not the
gravitational field, we will create a new class and calculate orbital velocity
instead of gravitational force, with the following elements:
1. There will be a new controlling class, class_spiral_galaxy_v.py instead
of grav_sim.py.
2. The class will be self-contained and not use a parent class for
simplicity.
3. The mass will be specified separately for the core, disk, and halo.
4. We will estimate the velocities along the x-axis for simplicity.
5. The velocity at a distance r from the center will be based on the normal
rule for circular orbits:
v = √(GM/R)   (5.1)

v = √(GM/R) = √(R GM/R²) = √(R F)   (5.2)
In this form, we see the velocity is the square root of R times the
gravitational field (force per unit mass). This means the mass distribution
M creates a gravitational field F at a distance R, and it is this field
that supplies the centripetal acceleration needed for the orbit.
So, instead of using M to directly calculate v from Equation 5.1, we will,
just as we did in the previous chapter's models, use M to calculate F, and
from that we calculate v.
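As a minimal sketch (the function name is ours, not the class's), converting a computed field into an orbital speed at each sampling distance might look like:

import numpy as np

def rotation_speed(R, F):
    # Circular-orbit speed at radius R from the field magnitude F
    # (force per unit mass), using v = sqrt(R * F).
    return np.sqrt(R * np.abs(F))

# If fields[i] is the field computed at distance dlist[i] from all the
# galaxy's points, the rotation curve would simply be:
# v_curve = [rotation_speed(d, g) for d, g in zip(dlist, fields)]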
Our goals then are to achieve the following: create spiral galaxy mass
distributions and to calculate their rotation curves. The mass distributions
will be based on a spherical distribution at the center representing the
galaxy core; a disk of material to represent the galaxy’s disk; and a halo of
material surrounding the core and disk. The core and disk will represent
what we normally see, and the halo will be the extra mass needed to
account for typical observed rotation curves. Our halo will be a shell with
an inner and outer radius because we expect matter on the interior to have
condensed into the core and disk, so by having an inner and an outer radius,
we can explore the effects of different halo sizes and thicknesses.
Our code is contained in two files: class_spiral_galaxy_v.py and
run_models.py. class_spiral_galaxy_v.py contains the formal definition
for the class we created – spiral_galaxy_model_v(), and its details are
discussed below. There you will find the complete code and detailed notes
on how the code functions, such as how the rotational speeds were actually
calculated (implemented), the galaxy models built, and how the graphics
output was done using matplotlib plot libraries.
sv = spiral_galaxy_model_v(2, 8, 30, 40, 200, 400, 3000, 50, 0, 0)
sets R1–R4 to 2, 8, 30, and 40; M1, M2, and M3 to 200, 400, and 3000,
respectively; and sets the end of the sampling radial on the x-axis at (50, 0, 0).
When running a simulation, through invoking instantiations of the
spiral_galaxy_model_v class, the following happens:
Parameters R1–R4 are used to set the sizes of the different galaxy
components: the core radius, the outer radius of the disk, and the inner
and outer radii of the halo.
Parameters M1, M2, and M3 set the masses (number of points) in each
of the components.
Parameters Rx, Ry, and Rz set the end point of the sampling radial. We
use about 50 steps along the radial for the sampling points.
Points – (x,y,z) coordinates – are generated for each of the components
– there are M1 of them for the core, etc. so the model has a total of
M1+M2+M3 points.
For every sampling point on the sampling radial, the gravitational field
there is calculated from all points in the galaxy and from that the
rotation speed found using Equation 5.2. However, to avoid numerical
instabilities, points very close to the sampling point are ignored.
Ideal traces, where the enclosed (interior) mass is assumed to be at the
center, are calculated for the core, the core and disk, and the whole
system. In each case, the largest scale is used, R1, R2, and R4. The
corresponding enclosed masses would be M1, M1+M2, and
M1+M2+M3, respectively.
A two-panel plot is created with a 3-D image of the galaxy on the left,
and on the right, estimated and ideal traces for the rotation speeds
displayed.
Figure 5.3 Rotation speeds when the disk mass equals the core mass.
The red curve is computed from the mass distribution. Blue dots
show the theoretical curve for the core and gold dots that for the
disk and core. Since rotation speed scales as the square root of the
enclosed mass, outside the disk the combined disk-and-core
results are 1.41 times (the square root of two) the core-only results.
Figure 5.4 Increasing the disk mass to three times the core
increases overall rotation speeds out to the disk edge, where the
expected fall-off occurs. Since the total enclosed mass outside the
disk is four times that of the core, the speeds are twice those for
the core mass alone.
In the equal mass scenario, the combined mass is twice that of the core,
so the corresponding rotation speeds are root-two (1.41) times greater.
When the disk is three times the core, the combined mass is four times that
of the core, so rotation speed scales by a factor of two; thus the exterior
rotation speeds (gold dots) are generally twice those of the core alone (blue dots).
Figures 5.3 and 5.4 also show that outside the disk, the whole system
behaves as if all the mass is effectively at the center, since it so closely
follows the ideal traces – consistent with our experiments in the previous
chapter. Note, the core+disk+halo curve mostly overlaps that for the
core+disk in these plots.
These scenarios also show that while we can add more mass to the disk,
the disk cannot produce a flattened rotation curve to the outside. More mass
must be added outside the disk.
While it wouldn't be consistent with our galaxy's disk, a galaxy with a
disk 25 times the core's size would produce a relatively flat rotation
curve, as shown in Figure 5.5, where the disk size was increased from 15 to
25 times that of the core.
Figure 5.5 Spreading the disk mass over a larger disk of size 25
does flatten our rotation curve so this might work for some
galaxies but would be inconsistent with our galaxy’s disk size.
So, if we want to flatten the rotation curve beyond the disk, we need to
add additional mass over and beyond what we see in the form of the core
and disk – called ‘dark matter.’ Figure 5.6 shows a scenario where
additional mass is added overlapping the disk (R3 = 10, R4 = 40) and
flattens the rotation curve, giving a more uniform overall appearance.
Figure 5.6 With a thicker dark matter halo extending from 10 to
40, overlapping the disk significantly, the rotation curve is
smoother and flatter.
By being able to model core, disk, and halos with different masses and
sizes, these models give much room to match observed rotation curves, but
it must be remembered that models should be based on observed
characteristics. In all cases, once we reach the edge of a system, the rotation
curve must take on the ideal form, and if it doesn’t, then that indicates there
is still more mass unaccounted for.
For completeness, the above charts were produced by the run_models.py
file shown in Figure 5.7.
Chapter 6
Sampling a Distribution
DOI: 10.1201/9781003600046-6
In this chapter, we will explore some techniques that can be used to create
samples based on probability distributions that are not necessarily uniform.
For example, detailed studies of globular clusters, such as M13 shown in
Figure 6.1, show that their density profiles are not typical, and not
Gaussian, and require the researcher to develop new, highly customized
models. We provide an overview of the essential concepts and develop
standard models to show how they function and to help the reader develop
their own versions to learn from. For us, a model will refer to a system's
Probability Distribution Function (PDF). Once we have our models, we can
then calculate the Cumulative Distribution Functions (CDFs) and the inverse
CDFs, and demonstrate how to use the inverse CDFs to generate a sample
consistent with the initial PDF.
Figure 6.1 M13, a famous globular cluster in the constellation
Hercules. Such clusters can have hundreds of thousands of stars
with spatial density distributions that are not normally
encountered in textbooks, and the usual tools for creating
Gaussian or exponential distributions are inadequate.
X = G(U)   (6.1)

X < x  ⟹  U < G−1(x)

Taking probabilities, the left side is just F(x) – the CDF – and, for a
uniform distribution, since Pr(U < b) = b, the right side is G−1(x); therefore

F(x) = G−1(x)   (6.2)
The plots in Figure 6.2 show the Gaussian PDF in the top left corner, and
its CDF in the top right. The inverted CDF (i.e., CDF−1) is shown in the
bottom left, and a histogram of samples generated from sampling CDF−1 in
the lower right. Note how the inverted CDF plot has unevenly spaced X-
axis values. This is remedied by resampling, and the result is shown as the
red overlay. It is also worth noting, with the PDF plot, that for each x
there is a unique p(x), but this is not true in reverse: most p(x) values
have two possible x values, which makes it difficult to use p(x) directly
for creating a sample of x.
Class model_dist’s functionality breaks down into the following areas:
For a selected model, the mode (model) is set (see line 7) and the
initialization builds the corresponding PDF, CDF, and CDF−1 arrays with
the make_pdf, make_cdf, and make_inv_cdf functions. After this, it
creates a sample set by sampling the inverse CDF array (get_samples) and
generates the chart output (make_four_panel_chart).
Note that once a model is specified and its PDF is created, all other
functions are generic in the sense they work for all models/PDFs. This
means plots can be readily produced for empirically based PDFs and other
non-standard distributions.
In running a model, a variety of arrays are produced with self-
explanatory names: self.x, self.pdf and self.cdf; inverse mapping is simply
done by reversing the self.x and self.cdf arrays when necessary. As
discussed later, self.xr and self.yr are also created based on resampling of
the self.x and self.cdf arrays because of uneven spacing/resolution
introducing unwanted granularity.
A critical aspect of the make_inv_cdf function is the order in how it
passes self.cdf and self.x to the self.resample() function, that is, it passes in
the arrays as (cdf, x) not (x, cdf), and this reversal is how the CDF inversion
is achieved.
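The same idea can be sketched with numpy's interp function (the names below are assumptions and this is not the model_dist code): swapping the roles of the CDF values and the x grid turns interpolation of the CDF into interpolation of its inverse, which can then be sampled with uniform random numbers.

import numpy as np

rng = np.random.default_rng()

def inverse_cdf_samples(x, pdf, n_samples=1000):
    # Draw samples from an arbitrary tabulated PDF by inverting its CDF.
    cdf = np.cumsum(pdf)
    cdf = cdf / cdf[-1]                   # normalize so CDF(max x) = 1
    u = rng.uniform(0.0, 1.0, n_samples)  # uniform deviates
    # Passing (cdf, x) instead of (x, cdf) performs the CDF inversion:
    return np.interp(u, cdf, x)

# Example: a linearly increasing PDF on [0, 1], p(x) proportional to x
x = np.linspace(0.0, 1.0, 200)
samples = inverse_cdf_samples(x, x)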
We will now look at the results for the different distributions to see how
the inverse sampling processes work.
Uniform Distribution
With a uniform distribution, (see Figure 6.4), all outcomes (events) are
equally likely, so a random variable with values in [0, 1] would have a
uniform distribution if its PDF was of the form p(x) = K, where K is a
constant. In this case, K = 1, so the area under p(x) is also 1. The histogram
of the generated samples is shown, which is flat – consistent with the PDF.
PDF values were calculated for evenly spaced x values between [0, 1], so
p(x) is always one. CDF(x) is the area under p(x) between 0 and x, and
hence is proportional to x, resulting in a straight line. Note the maximum
value CDF(1) = 1. The histogram of the sample values used 20 bins for the
1000 points drawn when sampling the random variable, and these are
distributed fairly evenly.
While not needed for the Uniform PDF, all models used resampling to
handle uneven spacing of CDF values. The resampling function
(self.resample) works by mapping the unevenly separated x values onto a
larger number of evenly spaced ones defined over the same interval and
linearly estimating the new y-values. For example, if the original data had
two consecutive points (x1, y1) and (x2, y2), then an oversampling point
(x0, y0) between them would have the y-value y0 = y1 + f*(y2−y1),
where f = (x0 − x1)/(x2 − x1). In other words, whatever fraction of the
way x0 was along the x interval, so also was y0 along the y interval. The
results of the resampling are shown as the red dots in the lower left chart
panels and are seen to fill in the gaps in the original CDF values.
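A hedged sketch of such a resampling step (not the actual self.resample, but the same linear-interpolation idea) maps the unevenly spaced (x, y) pairs onto a finer, evenly spaced x grid:

import numpy as np

def resample(x, y, n_new=2000):
    # Linearly interpolate (x, y) onto n_new evenly spaced x values.
    # For a new point x0 between (x1, y1) and (x2, y2):
    #     f  = (x0 - x1) / (x2 - x1)
    #     y0 = y1 + f * (y2 - y1)
    # which is exactly what np.interp computes.
    # Assumes x is sorted in increasing order.
    xr = np.linspace(x[0], x[-1], n_new)  # even spacing over the same interval
    yr = np.interp(xr, x, y)              # linear estimates of the new y-values
    return xr, yr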
Now that we have our inverse CDF in the form of xr and yr, it is easy to
generate the samples consistent with the initial PDF using the
self.get_samples function (see Figure 6.5).
Analytically, the CDF for the linearly increasing PDF should scale as the
integral of x (i.e., ½x²) and for the linearly decreasing as the integral of
(1−x) (i.e., x − ½x²). In both cases, the CDF is not evenly spaced, which is
why we used resampling.
However, the red dots for the resampling look quite effective and match
the raw inverse CDFs well, and the resampled inverse CDFs were used to
create the samples.
Both the quadratic and Gaussian results are as expected, with the latter
being an extreme example of uneven spacing needing resampling.
Summary
In this chapter, we explored the problem of how to generate a sample from a
distribution and illustrated the solution by applying it to a collection of
standard forms (PDFs). The demonstration code could be modified and
improved upon, but is functional, and most interestingly, the code could be
used with non-standard PDFs, perhaps from lab experiments, which are not
simple linear, quadratic, or Gaussian types.
The models showed how sampling an inverse CDF could produce a
sample set consistent with the specified PDF, and the results were very
reasonable and informative. One of the nice features of the code shown was
that once a PDF was defined (i.e., self.pdf), all the other functions needed
to create the CDFs and charts were immediately applicable. Also, by
showing how to model uniform, linear, quadratic, and Gaussian
distributions, not only do we illustrate how powerful the overall method
was, but we provide a useful set of templates to build upon.
In the next chapter, we will tackle a problem taught in introductory
college physics courses – projectile motion – but applied to very high
velocity projectiles where aerodynamic drag changes the acceleration
continuously and results in actual performance being considerably less than
the theoretical.
Chapter 7
Projectiles – The German 88
DOI: 10.1201/9781003600046-7
x = vx t   (7.4)

and in the y-direction, we have:

y = vy t + ½(−g)t²   (7.5)

v(y) = vy + (−g)t   (7.6)

v(y)² = vy² + 2(−g)y   (7.7)
How high can the projectile go? At the high point v(y) = 0 m/s, and Eq.
7.7 yields:
H = vy²/(2g)   (7.8)
This makes sense: The faster the initial vertical velocity, the higher it
should go, with gravity trying to reduce the effect. How long will the
projectile be airborne? The Time in Flight (T) is found from when the
projectile is back on the ground (y = 0), and Equation 7.5 yields:

vy + ½(−g)T = 0   (7.10)

Hence,

T = 2 vy/g   (7.11)
So, the faster the vertical launch speed component, the longer the flight.
How far will the projectile travel (the range, R) before striking the
ground?
R = vx T = 2 vx vy/g = 2V² cos(θ) sin(θ)/g = (V²/g) sin(2θ)   (7.12)
The German 88
The German 88 (see Figure 7.1) was an 88-mm-bore artillery piece that fired
very fast rounds, which meant it could generally strike targets with faster
impacts and at greater distances. Because it was such an effective weapon, it
could be used in an anti-aircraft role as a flak gun or mounted on a tank. The
flak version could also be used as an anti-tank weapon, killing enemy tanks
at ranges where the enemy guns were ineffective. Because its shells were so
fast, they could strike targets like tanks using a very flat trajectory,
making them easy to aim quickly – there was little need to lead the target.
Also, the high muzzle velocity allowed them to strike at high-flying aircraft.
Figure 7.1 The German 8.8 cm Flak 36 gun; one of the most
effective and versatile used in WWII.
While the performance might change depending on the type of shell used,
some typical reported effective performance figures are V = 840 m/s,
H = 9900 m, and R = 14860 m, for θ of 90 and 45 degrees, respectively.
Although there is some ambiguity as to what constitutes an effective range
or ceiling in a military sense as compared to our kinematic sense, these
numbers will serve as useful references.
What do our equations for H, R, and T produce for the German 88? When
θ = 45 degrees, the range R is 72000m and max height H = 18000m, with a
time in flight T = 121s. When θ is 90 degrees, R is zero of course, H =
36000m, and T is 171s. Clearly, these estimates greatly exceed reported
ones. The obvious explanation is that air resistance (drag) must be playing a
critical role by adding a deceleration. So, if we wish to improve our
estimates, we will need to take drag into account.
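These no-drag numbers are easy to reproduce; the short check below applies the standard kinematic formulas of Equations 7.8, 7.11, and 7.12 (the variable names are ours):

import numpy as np

g, V = 9.8, 840.0                      # m/s^2, muzzle speed in m/s

for theta_deg in (45.0, 90.0):
    theta = np.radians(theta_deg)
    vx, vy = V * np.cos(theta), V * np.sin(theta)
    H = vy**2 / (2 * g)                # maximum height
    T = 2 * vy / g                     # time in flight
    R = (V**2 / g) * np.sin(2 * theta) # range
    print(f"theta={theta_deg:.0f}: H={H:.0f} m, R={R:.0f} m, T={T:.0f} s")

# theta = 45 gives roughly H = 18000 m, R = 72000 m, T = 121 s;
# theta = 90 gives roughly H = 36000 m, T = 171 s.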
We know from aerodynamics that drag depends on speed and also on air
density, and this presents an interesting challenge: How should we estimate
drag since it will depend on both the projectile speed and altitude, and is
changing continuously?
D = ½ C ρ A v²   (7.13)

aD = [½ (C/m) A ρ0] f v²   (7.15)

hence,

aD = K f v²   (7.16)
In this form, K combines all the physical properties of the shell (mass,
cross-sectional area, drag coefficient) together with the sea-level density,
and the acceleration (really a deceleration, since it opposes motion)
depends only on the speed and on f(), the air density at a given height
expressed as a fraction of the sea-level density. So, if we choose a density
profile for the atmosphere by specifying f(), then for any projectile there
will be a K that controls the deceleration. In other words, if we select K
(perhaps through trial and error) to match one of the observations for a
given launch angle, we should be able to reasonably model the behavior for
all other launch angles.
Note that because f() will be no greater than one, and since the velocity is
large – v² is initially about 840² for the German 88 – K must be a small
number, measured in parts per million.
Unlike Equations 7.4 through 7.8, we need to solve a system where the
acceleration is changing in response to height and speed, so these equations
won’t work. However, if we break down the problem of modelling the
trajectory into time-slices (intervals), we can reasonably assume the
acceleration is constant during each interval, and we could use similar
equations if a deceleration from drag is added to both the x and y motions.
The strategy would be to calculate aD at the beginning of an interval, based
on the velocity and height, and use this to calculate the velocity, position,
and height at the end of the interval; the ending values are then used as the
starting values for the next interval, and the process is repeated for as many
intervals needed to follow the trajectory path.
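A minimal time-slicing sketch of this strategy is shown below; the names, the exponential density profile f(), and the time step are assumptions, and the book's projectile class is more complete:

import numpy as np

def fly(V=840.0, theta_deg=45.0, K=0.00036, dt=0.01, g=9.8, H_scale=8000.0):
    # Step a projectile forward in time with drag aD = K * f(h) * v**2
    # opposing the motion; f(h) is taken here as an assumed exponential atmosphere.
    vx = V * np.cos(np.radians(theta_deg))
    vy = V * np.sin(np.radians(theta_deg))
    x = y = t = 0.0
    while y >= 0.0:
        v = np.hypot(vx, vy)
        f = np.exp(-y / H_scale)            # assumed density fraction vs height
        aD = K * f * v**2                   # drag deceleration, opposite to velocity
        ax = -aD * vx / v
        ay = -g - aD * vy / v
        x, y = x + vx * dt, y + vy * dt     # simple Euler update: position, then
        vx, vy = vx + ax * dt, vy + ay * dt # velocity, assuming constant a over dt
        t += dt
    return x, t                             # approximate range and time of flight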
As a test of the code, we can set K = 0 for no drag, and the results should
match the theoretical values. First an instance of the class is created:
p = projectile()
and then p.run_K_list(840, 45, [0]) runs the no-drag case.
Figure 7.4 Results for the 45 degree and 90 degree scenarios with
K = 0.00036. Since we are exploring a specific K value, the K list
has only one entry.
The results are quite good. While the theoretical (no-drag, K=0)
calculations were more than 400% wrong (e.g., ranges of 70km vs 15km),
our models are within about 3% of the reported values.
Now that we have a reasonable, functioning model, we can use the
projectile class to explore the effects of drag in more detail. Using our code,
a set of three plots was generated, showing y vs x, v vs t, and v vs x, with a
K value list of [0, 0.000001, 0.00001, 0.00036, 0.001]. The results are shown
in Figure 7.5. The results (top panel) show that as drag increases, the
trajectory becomes less symmetric, the maximum height reduces and range
decreases, and the descent from maximum height is steeper. The middle and
bottom panels show how the speed changes throughout the flight. For the
German 88 (K = 0.00036), the velocity starts at more than twice the speed of
sound; becomes subsonic by 20 s into its 65 s flight; attains a minimum speed
of about 200 m/s at the highest point; and then approaches almost twice the
speed of sound as it falls under the influence of gravity.
Figure 7.5 Model results for 45 degree launch angles. Height vs
Distance (top), Speed vs Time (middle), and Speed vs Distance
(bottom).
We now have a reasonable model for projectile motion with drag present,
and it’s accurate to about 3%. In some ways, we have a surprisingly good
result since we have seen the projectile for the German 88 experiences both
supersonic and sub-sonic modes of flight, and having a single K value that
seems to work for both modes is fascinating. But the K value we found
through trial and error was a compromise trying to simultaneously match
best range for θ = 45 degrees and max height for θ = 90 degrees. We could
improve on this by allowing K to be a linear function of θ, based on a K(45)
= A, and K(90) = B where A and B are determined as before through
matching observations using trial and error. Then, when updating the
acceleration components, we would calculate K by:
K = ((B − A)/45) θ + 2A − B   (7.17)
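Equation 7.17 is just a straight line through the two calibration points; a small helper (ours, not the book's code) makes the check explicit:

def K_of_theta(theta_deg, A, B):
    # Linear K(theta) with K(45) = A and K(90) = B (Equation 7.17).
    return (B - A) / 45.0 * theta_deg + 2.0 * A - B

# K_of_theta(45, A, B) == A and K_of_theta(90, A, B) == B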
Summary
In this chapter, we took the traditional projectile problem taught in
introductory college physics courses and modified the standard equations of
motion by incorporating the effects of drag. These effects were modeled by
adding additional accelerations for the x and y directions, that depended on
air density and on a catch-all parameter K that was tuned to match
observations. Our results were surprisingly good for a system (the German
88) that includes both subsonic and supersonic flight. Our model computer
code is quite general in application and could be easily applied to other
projectiles – as long as the K parameter is calibrated by having the model
match accepted performance metrics.
In the next chapter, we will again use a time-slicing approach, but this
time we will be using dynamics to explore the problem of launching rockets
into orbit, and why we need multi-stage rockets.
Chapter 8
Rocket Launches
DOI: 10.1201/9781003600046-8
1. The initial MR, MF, and MP – the rocket, fuel, and payload masses
2. F, the rocket engine thrust
3. The fuel burn time Tb
4. Estimated fuel burn rate: r = MF/Tb
Launch Simulations
We would like to explore questions like the following:
As we shall see, because there are limitations on how much thrust a rocket
engine can generate, a single stage rocket cannot get much beyond LEO.
For the single stage model, we will use the first-stage engine from a Saturn
V and see how high and how fast it can go. The speed is a critical factor
here, because near the Earth’s surface, the escape velocity is about 11 km/s.
This means, a payload/satellite moving at less than this speed cannot escape
the Earth’s gravitational field. It’s not simply a problem of adding extra
fuel, as this can actually make things worse as more fuel is consumed to
support this additional weight!
For the LEO case, we consider the configuration for an orbit about
200 km high, and also a Lunar Mission, where the payload must be
delivered into the proper trajectory at a speed of about 10 km/s. As noted
previously, for all missions there will be height and speed constraints, and
the payload and fuel mass must be specified to match them. In addition, we
also wish to explore scenarios where a spacecraft climbs to a temporary
orbit where it accelerates and then departs to a position suitable for a
mission to the Moon. So we will add a parameter (Hmax) to the model: if
the height exceeds Hmax, the acceleration is done in-orbit, that is,
gravitational acceleration is ignored and no additional height is gained.
Our program consists of a Python class (rocket) with the complete code
listed under the Programming Notes, later in the chapter.
Modelling a multi-stage launch consists of getting the rocket stage
parameters, and calculating the height, velocity, acceleration, and mass for
each time step in each stage. Note each stage has a different gross mass,
fuel capacity, and thrust. Text results are printed to the console, and a chart
of the height, speed, and remaining mass is produced. Between stages, the
total mass used to initialize the following stage is decremented by the
burned-out stage's gross mass. Remember, the total mass for any stage burn
is the sum of the remaining stages' gross masses and the cargo.
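As a hedged illustration of one time slice within a stage burn (the names and the constant-g simplification are ours; the actual rocket class also handles parking orbits and console output):

def step(h, v, m, thrust, burn_rate, dt, g=9.8):
    # Advance height h, speed v, and total mass m through one time slice
    # of a stage burn.  Thrust acts on the remaining total mass; gravity
    # opposes the climb; fuel burn reduces the mass.
    a = thrust / m - g          # net acceleration on the remaining mass
    h_new = h + v * dt
    v_new = v + a * dt
    m_new = m - burn_rate * dt  # fuel consumed during this slice
    return h_new, v_new, m_new

# Between stages, the burned-out stage's gross mass is subtracted from m
# before the next stage's thrust and burn rate are used.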
With our rocket class design, we can easily run sequences of launches,
for single or multiple stages, for various payloads, with various parking
orbits. Rocket stages can be modified easily by changing their thrust, engine
and gross (hence fuel) masses, and other multi-stage launches added.
In Figure 8.1, we show the instruction set to use our rocket class to run
two different model types, a single stage rocket based on the Saturn V first
stage engine (run_saturn_V_I) to see how high it can go with a minimal
payload, and then a three-stage launch (run_saturn_V_TLI) to see if our
models show that escape velocity can be reached with a typical Lunar
Landing payload.
Figure 8.1 Using our rocket class, we can run a three-stage launch
for Trans Lunar Injection (cargo 50,000 kg) and also to study the
performance of a single stage main engine (140,000 kg payload).
Cargo is specified, and its influence on end results determined.
Hmax is a parameter used to set the parking orbit height; that is, on
reaching this height, the craft accelerates at that height. For the
single-stage launch model, Hmax only needs to be higher than the altitude
reached at fuel exhaustion for our purposes, and setting it to 500 km
exceeds single-stage limits and ensures no parking orbit phase is used.
The results of the simulations are printed to the console and shown in
Figure 8.2 where payloads (cargo) of 50000 kg and 1000 kg were used for
the 3- and 1- stage models. Even with such a small payload, the single-stage
rocket can barely make it into LEO. Earth’s gravity is just too strong, and
the thrust/energy provided by current engine technology is insufficient to go
higher. On the other hand, the three-stage rocket model can accelerate the
50,000 kg payload up near the needed escape velocity.
Figure 8.2 Model results from the three-stage and single-stage
models with 50,000kg and 1,000kg payloads, respectively.
Column ‘a’ shows the net acceleration when gravity is included,
while F/m shows the acceleration acting on the remaining total
mass.
The launch profiles are shown in Figures 8.4 and 8.5. In all cases, the
remaining/final mass is a very small fraction of the initial mass, most of
which consisted of the fuel consumed and the weight of the stages that were
discarded. Smoother curves would probably result if the next stage were
phased in before full exhaustion of the previous one.
Figure 8.4 Single stage model. Note the maximum height is very
low and the final velocity is well below that needed to escape the
Earth’s gravity. Mass is the engine mass, fuel remaining, and
payload.
Figure 8.5 Class rocket.
Even though our models are pretty simple, they are successful in
revealing the underlying physics of rocket launches. With current
technology placing limits on engine thrust, they demonstrate there is no
way a single stage engine can attain escape velocity or go beyond LEO. The
solution is to use multi-stage rockets, where rocket mass can be reduced by
jettisoning engines as their fuel is used up. In many ways, this is a
remarkable result: since fuel mass is the bulk of the overall mass, it is
striking that jettisoning the engine masses should make such a difference
(Figure 8.3).
There are lots of possible experiments students could undertake with
models like these, such as future engine thrust improvements, and perhaps
investigating what g would make escaping LEO impossible. In addition, it
would also be possible to explore drag effects in a manner similar to that
used for our projectile models. And it would also be very easy to add
additional rocket designs/specifications for class projects.
A particularly tricky class of scenarios involves solving a launch design
problem such as finding what fuel mass is required to inject a payload into
an orbit at a particular height and velocity. The challenge here is that the
velocity must match that for the orbit. Students might find they can only get
to a lower orbit at the desired speed, but that by adding additional fuel, they
reach the desired orbit but at the wrong speed – fundamentally because to
gain the additional height, they had to add additional fuel to carry the
additional fuel! In cases like this, there might not be any solution other than
to add additional payload, that is, ballast, and having to iterate between
ballast and fuel load is a fun challenge.
Summary
In this chapter, we used time-slicing to analyze a rocket launch model in
which the changing mass caused by fuel burn could be taken into account.
While there are analytic solutions, the problem becomes significantly less
tractable if the variation of gravity with height is also considered. Because
our models are based on very basic physics, they are appropriate for young
students, and their focus can be directed toward investigating various
scenarios with different height and payload constraints. Our models
demonstrated why single stage rockets cannot be used to escape Earth’s
gravitational field, and how this could be done using multi-stage ones. The
models could be easily extended and customized to match the many
possible designs.
Chapter 9
Building a Star Catalog from an
Image
DOI: 10.1201/9781003600046-9
Read in an image
Identify targets (stars/asteroids)
Estimate the position and fluxes for each target
Create a catalog of the targets found
With catalogs like these, we could compare targets from catalogs created at
different times to see if any targets moved or changed brightness, in support
of asteroid, supernova, or variable star studies. In addition, we might
compare them to highly accurate star catalogs in order to assign
astronomical coordinates such as RA and DEC to our targets. Or we could
use target position as input to more detailed photometry functions (beyond
simple flux counts of pixels exceeding a particular threshold) such as
single- and multi-iris photometry. In our case, we developed a class called
‘imcat’ that can analyze an image, generate a catalog, and print out a chart
of the detected objects which can be compared with the input image. Script
test_imcat.py, shown in Figure 9.1, demonstrates how it was applied to an
image of the T CrB region. (T CrB is a recurring nova that was predicted to
go nova around the time of writing, and we imaged it using a smart scope.)
The rendered star sizes in the plotted star chart use a scaling factor of
4000 (see Figure 9.1, line 14), determined through trial and error and based
on star fluxes; larger scale factors produce smaller rendered stars.
The image processed in this example is shown in Figure 9.2 and the
resulting catalog’s chart is shown in Figure 9.3.
Figure 9.2 The jpg image of T CrB (center) used to build the
catalog. The field is about 1.2 x .75 degrees. Note the image label,
which is flagged to be trimmed using the my_source=’Seestar50’
entry at line 15 of Figure 9.1.
Figure 9.3 A chart produced from the image catalog suggests that
the catalog building was quite successful. Here the stars are
circles, whose sizes are scaled to catalog-integrated fluxes.
This code assumes there is a flat field image called flat.jpg in the same
directory. The target image was taken using a Seestar 50 on May 11, 2024.
The Seestar 50 image is used here because smart scopes like it are
becoming very popular. However, their jpg images can have footer text, and
by setting the im_source parameter to Seestar50, we can trim out the image
footer with the text. If you are not using a Seestar 50, or the image has no
label, just leave im_source empty.
The code in Figure 9.1 is very simple – all the difficult stuff is hidden
away in the imcat class. The detection threshold is set as a multiple of the
average intensity. In this instance we used 2.0, found through trial and
error (see Figure 9.1, line 8): increasing it much further resulted in
unacceptably few detections, while reducing it closer to 1.0 produced an
overwhelming and unmanageable number.
When producing the chart, the function was passed a value to set the
number of stars to use (line 14). Since the catalogs are sorted with the
brightest targets first, in the example only the brightest 300 targets are
drawn.
We will now describe the imcat class in detail.
Class imcat_io
Image I/O is supported by functions in class imcat_io shown in Figure 9.4.
Figure 9.4 Class imcat_io has functions to read in and manipulate
images used by imcat.
If an image flat is present, it can be used to flatten the main image being
cataloged. The jpg flat file is read in by the read_flat function (line 8),
converted into a grayscale (line 11), and normalized so the maximum pixel
intensity is unity (lines 13–14).
The image to be cataloged is read in as a jpg file and converted to
grayscale and saved to self.img (lines 16–21).
Image flattening is done by function apply_flat, by dividing each pixel
in self.img by the same pixel in self.flat which produces a new flattened
image self.imgf.
Because it is useful to work with a subset of an image, function
set_subim can select a sub-image of self.img (line 34) and set appropriately
sized utility images/arrays (self.flat, self.ims, self.imgf). Each pixel in the
self.ims array holds the star number associated with the pixel; this allows us
to identify neighboring stars when looking at a given pixel.
Function get_flattened_image combines the other utility functions into a
single utility to read in, resize, and flatten the target image.
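A minimal sketch of the flattening steps just described, using PIL and numpy (the file and function names are assumptions, not the imcat_io code itself):

import numpy as np
from PIL import Image

def read_gray(path):
    # Read a jpg and convert it to a grayscale float array.
    return np.asarray(Image.open(path).convert("L"), dtype=float)

def flatten(image_path, flat_path="flat.jpg"):
    # Divide the target image by a flat normalized to a peak of one.
    img = read_gray(image_path)
    flat = read_gray(flat_path)
    flat = np.clip(flat / flat.max(), 1e-3, None)  # normalize; guard against zeros
    return img / flat                              # the flattened image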
Class imcat_pixels
This class holds the four functions used to examine each pixel and
determine which star it is a member of and is shown in Figure 9.5.
Figure 9.5 Class imcat_pixels.
Because only pixels with intensities greater than the threshold are of
interest, a list (self.pixlist) of those pixels’ coordinates is constructed using
function build_pixlist(), and this list is normally considerably smaller than
the number of pixels in the image. Working with self.pixlist is more
efficient than with self.img.
When finding stars, pixels in self.pixlist are studied to see if they are
isolated from others or are touching one or more stars. To keep track of
what star a pixel is in, self.ims is an array where instead of intensities, at
each [i,j] the assigned star_number is stored.
A star is a dictionary entry (self.star_pix_dict) where the key is the
star_number and the value is a list of pixel coordinates for each pixel in the
star. For example, self.star_pix_dict[10] might equal [[1,2], [23,21],
[45,78]]. To add a pixel at [46,78] to the star, its coordinates (in brackets)
are appended to the list. The length of the list tells us the star area.
If a pixel is isolated, it is the first pixel of a new star, and a new star is
created in star_pix_dict. If a pixel is touching a star, it is added to the star.
If the pixel touches more than one star, the pixel forms a connecting bridge
between them so those stars are merged, and the pixel added to the resulting
star.
Merging two stars, A and B, simply means merging the coordinate list
from B into that of A – where B has a higher star_number than A, changing
the self.ims pixel values for B over to A, and deleting
self.star_pix_dict[B].
As new stars are detected, new key:value entries are added to
star_pix_dict, and as stars are merged, unwanted entries are removed.
Finding stars therefore consists of the following steps:
1. Build self.pixlist, the list of above-threshold pixel coordinates.
2. For each pixel in self.pixlist, check self.ims for neighboring pixels already assigned to stars.
3. If the pixel is isolated, create a new star; if it touches one star, add it to that star; if it touches several, merge those stars and add the pixel to the result.
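The following simplified stand-in (not the imcat_pixels code; the names are assumptions) captures this bookkeeping: a dictionary maps star numbers to pixel lists, and an array the size of the image records which star owns each pixel.

import numpy as np

def find_stars(pixlist, shape):
    # Group above-threshold pixels (i, j) into 'stars' of touching pixels.
    ims = np.zeros(shape, dtype=int)           # 0 = unassigned; else star number
    star_pix = {}                              # star_number -> list of [i, j] pixels
    next_id = 1
    for i, j in pixlist:
        nbrs = set()                           # star numbers of assigned neighbors
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ii, jj = i + di, j + dj
                if 0 <= ii < shape[0] and 0 <= jj < shape[1] and ims[ii, jj] > 0:
                    nbrs.add(int(ims[ii, jj]))
        if not nbrs:                           # isolated pixel: start a new star
            sid = next_id
            star_pix[sid] = []
            next_id += 1
        else:                                  # touching one or more existing stars
            sid = min(nbrs)
            for other in nbrs - {sid}:         # merge any bridged stars into sid
                for p in star_pix.pop(other):
                    ims[p[0], p[1]] = sid
                    star_pix[sid].append(p)
        ims[i, j] = sid
        star_pix[sid].append([i, j])
    return star_pix, ims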
Class imcat_cat
Once all the stars have been identified and saved in star_pix_dict,
functions in class imcat_cat (shown in Figure 9.6) are used to build the
final star catalog for the image. Output entries for each star consist of its
star_number (k) and the threshold used (T), its position based on its center
of gravity (icg, jcg), the integrated flux (Itot), and area (npix).
Figure 9.6 Class imcat_cat holds functions needed for writing out
and reading in star catalogs.
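For each star, the catalog quantities follow directly from its pixel list; a hedged sketch (ours, not the imcat_cat code, and using an intensity-weighted center of gravity that may differ from the book's choice) is:

import numpy as np

def catalog_entry(k, pixels, img, T):
    # Build one catalog row from a star's pixel list.  k is the star_number,
    # pixels a list of [i, j], img the image array, and T the threshold used.
    fluxes = np.array([img[i, j] for i, j in pixels], dtype=float)
    Itot = fluxes.sum()                                  # integrated flux
    icg = sum(i * f for (i, _), f in zip(pixels, fluxes)) / Itot
    jcg = sum(j * f for (_, j), f in zip(pixels, fluxes)) / Itot
    return {"sid": k, "T": T, "icg": icg, "jcg": jcg,
            "Itot": Itot, "npix": len(pixels)}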
Class imcat
All the functionality built into the three supporting classes (imcat_*) is
used by class imcat (see Figure 9.7) to process an image, produce a star
catalog, and generate a chart, such as that shown in Figure 9.3, from the
catalog.
Figure 9.7 Class imcat.
Running the imcat class consists of getting a flattened image (line 59),
setting a threshold (line 62), building and writing out the star catalog (lines
64–65), and using the catalog to produce a chart (line 68).
The star chart is produced by plot_catalog_xy_coordinates (line 30),
which reads in the catalog csv file (line 31) – this way, a modified code
could simply work with pre-existing catalogs without having to process an
image.
The plot_xy_catalog_coordinates() function takes a specified number
(N) of rows to control clutter, and a scale_factor used to control star
sizes. It extracts four data columns (line 36) – the [i, j] position, star id,
and flux – and uses the [i, j] positions to draw the stars and the star
fluxes/intensities/brightnesses to set the charted star sizes via the
provided scaling parameter (shown as 4000). The scale factor was chosen
through trial and error, and a minimum star size of 1 is enforced.
The functions defined in imcat.py are flexible in that they can be easily
modified to suit your purpose, and it would be easy to wrap them in a loop
to iterate over a directory of images. Very limited hardening, such as
testing for empty return values, was done, for the sake of clarity. For
limited-scope projects you can get away with this for a while, but any code
used for more than demonstrations should be hardened.
get_flattened_img will flatten self.img and set it to the result. This way,
self.img can either be used in the original form or flattened. The
build_img_catalog function is simply a one-step solution for cataloging an
image, suitable for iterating over a directory of images.
The class also includes a section after the ‘if __name__ ==’ conditional
to demonstrate how to use the class and produce a catalog and its chart
shown in Figure 9.3.
Summary
In this chapter, a Python class was explored to generate a catalog of stars
from an image. A catalog of detected objects can be a very powerful
resource, and the coordinates of entries can be used to track
changes/movement, be cross identified with regular star catalogs, and also
serve as input to photometry tasks, where the simple flux counts are not
good enough. The functions provided can be modified easily for other
applications.
In the next chapter, we will explore how to make photometric
measurements on an image, which will require the user to select targets
using a cursor and to select backgrounds so that background components can
be removed. By using a list of known stars and their magnitudes, the fluxes
can then be estimated.
Chapter 10
Photometry
Measuring Object Brightness
DOI: 10.1201/9781003600046-10
A star in an image looks like a blob with tapering edges, and astronomers
doing precise photometry, when measuring the brightness of a star, will try
and measure the flux within a certain radius of the star’s center (the ‘iris’ or
‘aperture’). (For our purposes, we will consider flux, pixel values, and pixel
intensities as being the same.) As the radius is increased, a greater fraction
of the star’s light will be included; however, since there is always some
background light and noise, the larger the radius, the greater the false signal
from the background. In general, astronomers will decide on an optimal
radius to use, measure the light within that radius of the star’s center (which
includes both star and background light), and then measure the background
and subtract it from the first measurement.
This can be done in one step, where three circles are specified with the
inner circle used to measure the target, and the annulus formed by the two
outer ones used to estimate the background, which can then be used to
remove the target’s background level. This approach is called multi-aperture
photometry.
An alternate approach is to use single-aperture photometry, where one
circle/aperture is defined. This is centered on the target for a measurement
and then placed nearby, so a background can be estimated.
We will use the simpler single-aperture method since only one aperture
size needs to be used, and since we will be doing things interactively, we
can simply choose where to measure the background instead of having to
adjust the radii to exclude nearby stars, as with the multi-aperture approach.
Because we obviously need to include a function to estimate flux within
a certain radius centered on the cursor position, the code could be modified
for multi-aperture photometry by defining three concentric apertures with
radii r1 < r2 < r3, measuring the fluxes in each to yield f1, f2, and f3, and
then using f3 − f2 to estimate the background level (or more likely, the
background level per pixel, since the annulus area might differ from
the inner circle's).
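A minimal sketch of the flux-within-a-radius measurement (names and positions are assumptions; the imphot class adds the interactive cursor handling and SNR reporting):

import numpy as np

def aperture_flux(img, x0, y0, r):
    # Sum pixel values within radius r of (x0, y0); also return the pixel
    # count so a per-pixel background level can be formed.
    yy, xx = np.ogrid[:img.shape[0], :img.shape[1]]
    mask = (xx - x0) ** 2 + (yy - y0) ** 2 <= r ** 2
    return img[mask].sum(), mask.sum()

# Single-aperture photometry: measure on the star, then on nearby empty sky,
# and subtract background-per-pixel times the star aperture's pixel count.
img = np.random.poisson(10.0, (1024, 1024)).astype(float)  # stand-in image
flux_star, n_star = aperture_flux(img, 512, 400, 8)        # hypothetical positions
flux_sky, n_sky = aperture_flux(img, 560, 430, 8)
star_signal = flux_star - (flux_sky / n_sky) * n_star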
Measurement Strategy
A general strategy for measuring target magnitudes requires measuring the
reference stars so the magnitude calibration can be performed and then the
targets of interest. In most cases, it will be necessary to use matplotlib’s
display and selection tools to zoom in and out, so as to best see the object
being measured. This will not affect the measurements.
1. Find the best iris size by examining the target or a reference star:
a. Enter ‘b’ and make a background measurement
b. Enter ‘s’ and make multiple star measurements of the first
reference star
c. Adjust the iris using +/− and find the best SNR; do not change the
iris for the remainder of the processing.
2. For each reference star:
a. Type ‘s’ and click on the star’s center,
b. Type ‘b’ and click on background,
c. Repeat steps a or b if desired,
d. Save the measurement and move to the next reference star (type
‘n’).
3. Measure the star and background levels for the other reference stars.
After this, the magnitude scale is calibrated.
4. Measure the star and background levels of the targets of interest using
the same process as for the reference stars (step 2), using 'n' to save
and move on to the next target.
5. Results appear on the console and can be listed by typing
imc.star_list.
Testing imphot
To test and demonstrate the imphot class, a small program
(test_imphot.py) was created, which uses the T CrB image used by
test_imcat.py of an earlier chapter. The code is shown in Figure 10.1.
Figure 10.1 A short program (test_imphot.py) that uses the
imphot class for single-aperture photometry.
A zoomed-in view of the first reference star is shown in Figure 10.3, with
the star iris (white) and a background iris (blue) while changing the star’s
iris to improve the SNR. Figure 10.4 shows the console output during the
process.
Other Considerations
Photometry is very challenging to do properly, and it is easy to think your
carefully made measurements must necessarily be reasonable and of high
quality. The reality is, most likely, your first results will disappoint when
you compare them with others. There are many possible reasons. A simple
calculation of brightness or magnitude from fluxes that are compared to
reference stars can be too simple. Optical systems and cameras can have
different sensitivities to different wavelengths requiring complex calibration
techniques. Perhaps one of the comparison stars has an unknown variability?
Perhaps your sensor is non-linear and some stars are saturating more than
you realize? Perhaps neighboring light pollution is creating an unwanted
gradient in your images? For reasons such as these, it is important to be
realistic and honest in your accuracy estimates. One way to get a sense of
your accuracy would be to measure the magnitudes for a set of stars within
a few magnitudes of your target’s. Then, their magnitudes’ standard
deviation is a reasonable estimate of your measurement error. My
experiments using the imphot software on the 9th mag star T CrB (when it
was anticipated to go nova in 2024) were consistent at the 0.05–0.07 mag
level, using the Seestar 50 jpg images. This is a very reasonable result for a
software solution that is very quick and easy to use, very transparent in its
workings, and therefore very suitable for student use.
Summary
In the code we described here, we provided the user with a functioning
interactive solution to learn from and to perhaps modify for their own
purposes. Most photometry software solutions are very complex, with more
capabilities than might be necessary, so there is a benefit in having a
simpler version – even if not automated – to learn from and to work with.
The beauty of organizing code into a Python class structure is that the
user can build on the class if they want to develop a custom solution,
perhaps more automated, for their needs. Just as we did for the imcat class, we could
add a simple function to save the measured data in self.star_list to a csv file
if desired. We didn’t do this step since it’s not that difficult and is one less
step to explain.
Chapter 11
Aligning Images and Finding
Targets
DOI: 10.1201/9781003600046-11
Our brains are superb pattern matching and image processing machines. For
example, look at the two images of the T CrB region taken a day apart,
shown in Figure 11.1. We can readily identify stars in one image
corresponding to those in the other. It all seems so easy; the two images
appear to be a simple shift or translation of each other. Surely there must be
a simple algorithm to map one image onto the other.
Figure 11.1 Two images of T Crb (slightly above the image
centers) taken a day apart. Even though the images have shifted
relative to each other, we can easily pick out matching star
patterns.
Image Features
One approach to matching images is to identify features common to both
and to use these to figure out how to map one image onto the other. While
this seems simple, it’s difficult to find a general solution that is robust,
especially when the images might have significant differences.
Our brains are excellent at pattern matching and at detecting real or
imagined patterns. What is it about a star in one image that makes it
possible for our brains to identify it in another? If it does not have
something intrinsic such as extreme brightness difference or color relative
to others, then it must be the spatial relationships between the star and all
others. This suggests exploring these relationships, which at the simplest,
relates to the distances and directions between stars.
Appendix I shows a simple Python class (spatial) that graphically
emphasizes the spatial relationships each star in an image has with its
neighbors and a short code for running it. It is easily run and was applied to
a T CrB image to produce the plots shown in Figure 11.2. Instead of simply
drawing connecting lines between all possible stars, the plot shows the
connections between each of the stars used and nearby neighbors (actually
up to 6 were plotted to reduce clutter), by drawing short radials (vectors)
where each radial length was scaled to be a third of the separation.
Figure 11.2 Using our class spatial allowed us to identify the
neighbors for each star and draw radials which emphasize each
star’s relationship to its neighbors. The resulting patterns for each
star suggest features based on separations and directions would be
useful for identification purposes.
1. Set the number of stars to be used from each catalog. This was set to 6
for our example. The number of combinations to test roughly scales as
the cube of this number. Since the catalogs are sorted by brightness,
the 6 brightest catalog entries were used from each catalog.
2. Read the brightest targets from the image catalogs into dataframes df1
and df2.
3. Create lists of three-star combinations for each image (combos1 and
combos2), where a combination is a list like [sid1, sid2, sid3], that is,
it consists of the star_id numbers in our catalog files.
4. For each combination from image 1, compare it to every combination
for image 2, by
a. Finding and sorting the lengths of the triangle sides and,
b. Calculating the sum of the absolute values of the differences. For
example, if the first triangle's sides are 2, 3, 4, and the second
triangle's sides are 2.1, 3.2, and 3.9, the sum of the absolute
differences would be (0.1 + 0.2 + 0.1) = 0.4.
5. Use the triangle pair with the smallest sum of differences as the best
match; from these, the translational and rotational shifts can be
determined (a sketch of this matching step is shown after this list).
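A hedged sketch of the matching step described in this list is shown below; the dataframe column names (sid, icg, jcg) are assumptions, and the catalign class does more bookkeeping:

import itertools
import numpy as np

def side_lengths(df, triplet):
    # Sorted side lengths of the triangle formed by three catalog sids.
    # df is assumed to be a pandas dataframe with columns sid, icg, jcg.
    pts = [df.loc[df.sid == s, ["icg", "jcg"]].to_numpy()[0] for s in triplet]
    sides = [np.linalg.norm(pts[a] - pts[b]) for a, b in ((0, 1), (1, 2), (0, 2))]
    return np.sort(sides)

def best_match(df1, df2, n_stars=6):
    # Find the triplet pair whose sorted side lengths differ the least.
    combos1 = list(itertools.combinations(df1.sid.head(n_stars), 3))
    combos2 = list(itertools.combinations(df2.sid.head(n_stars), 3))
    best = (None, None, np.inf)
    for c1 in combos1:
        s1 = side_lengths(df1, c1)
        for c2 in combos2:
            score = np.abs(s1 - side_lengths(df2, c2)).sum()
            if score < best[2]:
                best = (c1, c2, score)
    return best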
In our approach, we are simply using the lengths of triangle sides for
matching. Implicit in specifying the sides are the associated angles, so we
really are relying on the star radials shown in Figure 11.2, but we are only
using two radials at a time from any star.
The complete code for this solution (class catalign) is described below.
We can see how well the catalign class functions from the small test
program shown in Figure 11.3. It’s very basic; two catalogs are imported
and the align_ims function is used. After the alignment is finished, two
additional dataframes have been created and are available for further study
or plotting. The reference catalog is stored in df1, and the one being
matched is in df2. After the alignment, the linearly shifted version of df2 is
saved as df2s, and the rotationally corrected version of df2s is saved as
df2sr.
In Figure 11.5, the star triplets used to match the catalogs are shown. The
linear shift between the images is found by subtracting the coordinates of
sid 61 (red) from those of sid 52 (blue). The angle offset for a triangle side
is found by comparing its directions for the two catalogs, that is, the
directions from sid 10 to sid 52 with that for sid 16 to sid 61. The rotational
offset is the average from the three sides.
Figure 11.5 The triangle/triplet stars from the df1 (red) and df2
(blue) catalogs used for alignment.
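Once a matching triplet is found, the offsets follow from the matched vertex coordinates; a simplified sketch (ours, assuming small rotation angles so no wrap-around handling is needed):

import numpy as np

def offsets(p1, p2):
    # p1 and p2 are 3x2 arrays of matched triangle vertices (reference and
    # comparison catalogs, in the same vertex order).  Returns the linear
    # shift and the mean rotation angle (radians) between corresponding sides.
    shift = p1[0] - p2[0]                       # shift from one matched vertex pair
    angles = []
    for a, b in ((0, 1), (1, 2), (0, 2)):
        v1, v2 = p1[b] - p1[a], p2[b] - p2[a]
        angles.append(np.arctan2(v1[1], v1[0]) - np.arctan2(v2[1], v2[0]))
    return shift, np.mean(angles)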
In Figure 11.6, a zoomed-in portion of the shifted df2s data and the df1
plot is shown. It just so happened that T CrB was used for the linear shift
(it was one of the six brightest objects in both catalogs), and as the result
shows, T CrB itself is in excellent alignment; but nearby stars still carry an
uncorrected rotation and show up as offset red/blue dot pairs.
Figure 11.6 Close up of the region near T CrB (object 61/52 on
the lower left). After the linear shift, which happened to use T
CrB as the center. T CrB aligned just fine, but other stars show an
uncorrected rotational offset.
Finally, adding in the rotation correction fixes the unaligned stars (see
Figure 11.7), which shows a zoomed-in portion of the region near the
bottom right. The uncorrected and corrected versions for the rotation are
shown on the left and right panels.
Figure 11.7 A close inspection of the region’s bottom right corner
shows the effects of adding the rotation correction (right side) to
the linearly shifted catalog (left).
Important outputs from our alignment efforts are the estimates of the
linear and rotation offsets, and the center of rotation. Knowing these,
equations could be written to map the coordinates from one catalog into the
other’s reference frame. This would allow for a sequence of images to be
analyzed where a target of interest was identified in the reference (first)
image, and then knowing the transformation equations, analyze the
corresponding region in each of the subsequent images. The analysis might
be photometric, such as measuring the intensity centered on the calculated
position.
But there is another approach to using our alignment results: instead of
studying the comparison images at pixel coordinate values derived
from the transformation equations, we can use the alignment to identify
matching catalog entries for our targets. There is an important difference between these two
approaches. In the first, the target’s properties are measured from an
estimated position based on the transformation equations; in the second, by
using transformation equations to identify the target in the comparison
image, the target’s cataloged/measured position can be used for the analysis,
and the catalogued position is probably the more accurate.
For our demonstration we compared two catalogs derived from our
images. There is no reason why one of the catalogs could not be a subset of
a standard star catalog, suitably modified, using astronomical equatorial
coordinates (Right Ascension and Declination). This way target equatorial
coordinates could be found and associated with our catalog targets, and
equatorial coordinates added to all our catalog entries. Once done, images
and catalogs could be accessed based on equatorial coordinates.
Finally, when constructing an overlay chart for two aligned catalogs, like
the properly aligned chart (right pane) in Figure 11.7, each target has two
labels, a red and a blue, showing the sids/labels from their respective
catalogs. Any target with only one label is a transient or a moving target.
For example, Figure 11.8 shows the zoomed-out overlay of the aligned
catalogs. There are single-label objects along the top and bottom, showing
those objects were not present in the second catalog, probably because of
the image center shift. However, a single-label object not near the frame
edge warrants further inspection, since that would indicate a transient
appearing in one catalog and not the other. At fainter levels, the transient
might simply be a consequence of the thresholding level, or of transparency,
but a brighter one could be real. For visual detection of transients with
varying brightness, it might be useful to draw circles, instead of disks,
since a change in brightness would result in two concentric circles of
different sizes being displayed instead of two same-sized ones.
Figure 11.8 When properly aligned, the catalog objects present as
single stars/targets with two labels. There are a few single label
objects near the center, just below T CrB (object 61/52), for
example the red one labelled 72. Without further study, being
small, it’s likely simply a marginal detection and only appearing
in the first catalog, but it could be a transient.
In general, normal matrix i-j coordinates are used instead of cartesian x-y
coordinates, which places the coordinate origin at the top left instead of the
bottom left corner. This means the plots will be rotated. In other chapters,
we avoided this by converting (i,j) to (x,y) using a mapping like: x = j, y =
Nrows – i. This slight overhead is not needed here and is omitted for simplicity.
Naming conventions for dataframes follow the processing flow. The
reference image is dfA or df1, and the one being matched is dfB or df2;
applying a linear shift to df2 produces a new dataframe named df2s; adding
a rotation to df2s produces df2sr.
Catalog and dataframe rows contain a label called ‘sid’ – short for
star_id. Generally, stars are referenced using their sid instead of their
dataframe row.
The class breaks down into five functional groups: Initialization and
Input; top level functions to implement the alignment and find matching
star triplets; utility functions for manipulating/selecting pixels based on
their (i,j) coordinates; finding and applying linear shift and rotation
corrections; and output plotting.
When reading in a catalog from a csv file, only the first N rows are read.
With different catalogs, subsets could be selected after reading them in,
perhaps sorting by brightness or size first.
Function align_ims compares the dataframes for the catalogs. It is
important to specify how many stars to use for matching – which is not the
same as the numbers of stars to read in from the catalogs – since these are
the stars from which the triplet combinations are created, and cross-
compared to find the best match. Because the number of combinations
scales as the third-power of the number to use for matching, only a small
number should be used. align_ims finds the best matching triangles and
uses them to get and apply the linear and rotation corrections.
Finding the best match is done by the find_best_match_triangle
function. It uses the itertools library to build combinations (triplets) of sids
for each catalog and then tests them against each other to find the triangles
where the sum of the sides best agrees. When finished, it returns the
triangles to be used for determining the linear and rotational offsets.
When align_ims() finishes, it returns dataframes for the shifted and
rotated comparison image, and the star triangles used to make the
mappings.
Finding the best matching triplet consists of testing all combinations for
image 1 against those for image 2: a list (triplet) stores each triangle
pair (one from each catalog) at line 50, and another list (diff_list) stores
the calculated differences (line 51). The best-match triangles are found by
using the index of the smallest difference to locate the corresponding
triangle pair in the triplets list.
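The core of this search can be sketched in a few lines. The following is a minimal illustration of the idea, not the book's catalign code; the dataframe columns (sid, i, j), the default number of stars to use, and the helper triangle_side_sum are assumptions made for the sketch.

import itertools
import numpy as np

def triangle_side_sum(df, sids):
    # Sum of the three side lengths of the triangle formed by three sids.
    pts = [df.loc[df.sid == s, ['i', 'j']].values[0] for s in sids]
    return sum(np.linalg.norm(pts[a] - pts[b]) for a, b in [(0, 1), (1, 2), (0, 2)])

def find_best_match_triangle(df1, df2, n_use=6):
    # Cross-compare all sid triplets from each catalog and return the
    # pair of triangles whose summed side lengths agree best.
    trips1 = list(itertools.combinations(df1.sid.head(n_use), 3))
    trips2 = list(itertools.combinations(df2.sid.head(n_use), 3))
    triplets, diff_list = [], []
    for t1 in trips1:
        for t2 in trips2:
            triplets.append((t1, t2))
            diff_list.append(abs(triangle_side_sum(df1, t1) -
                                 triangle_side_sum(df2, t2)))
    best = int(np.argmin(diff_list))
    return triplets[best]

Because both catalogs contribute all their triplet combinations, the work grows quickly with n_use, which is why only a handful of the brightest stars should be used for matching.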
To support data manipulation and selection, there is a group of utility
functions needed to build dataframes derived from others, and for extracting
values from dataframes based on sid (get_sid_ijF) or data column labels
(get_ijsF_df_cols), and to get the distance between points (get_ij_sep).
Because processing creates new dataframes based on manipulating previous
ones, deep copies are needed to protect data integrity and ensure the derived
dataframes are independent of the originals and not just links.
Once a matching pair of triangles is found, they are used to derive the
linear and rotational shifts needed to bring the catalogs into alignment.
get_triangle_sides_from_sids takes the sids that make up a triangle and
returns a list of the triangle side lengths, normalized so the smallest is 1.
The normalization is included in case catalogs from images of differing
plate scale were being compared; it is not strictly needed here.
Note, when comparing images with different plate-scales, the scale factor
could be reasonably estimated from the ratio of the actual lengths (borders)
of the triangles, and then used later as an additional correction (i.e.,
knowing the plate scale would be needed for creating plots showing targets
from both catalogs), but catalog matching does not require it.
The linear (translation) offset between the images is found
(get_lin_shift) by subtracting the coordinates of the first point of the
matching triangles from each other (lines 102–103) and applied
(apply_lin_shift_BtoA) to df2 to create df2s (line 26). Rotations are
handled similarly: since the triangles now have one overlapping point, the
orientation of a side can be compared between the triangles to find their
directional discrepancy, which is used to map df2s onto df2sr by applying a
rotation correction (line 29).
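As a rough sketch of these two steps (again, not the book's exact code; the column names i and j, the angle convention, and the function names are assumptions), the shift and rotation could be applied to a catalog dataframe like this:

import numpy as np

def apply_lin_shift(df2, di, dj):
    # Return a deep copy of df2 with all (i, j) coordinates shifted.
    df2s = df2.copy(deep=True)
    df2s['i'] = df2s['i'] + di
    df2s['j'] = df2s['j'] + dj
    return df2s

def apply_rotation(df2s, theta, ic, jc):
    # Rotate all points by angle theta (radians) about the center (ic, jc),
    # e.g. the shared point of the matched triangles.
    df2sr = df2s.copy(deep=True)
    di, dj = df2sr['i'] - ic, df2sr['j'] - jc
    df2sr['i'] = ic + di * np.cos(theta) - dj * np.sin(theta)
    df2sr['j'] = jc + di * np.sin(theta) + dj * np.cos(theta)
    return df2sr

Note the deep copies: as discussed above, they keep the derived dataframes independent of the originals.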
Finally, for demonstration and testing, there is a group of plotting
functions. Function plot_coordinates will plot a dataframe’s star positions,
plot_sid_list will plot a list of sids, and plot_AB_coordinates will plot the
coordinates from two dataframes on the one chart. For convenience, the
plot_triangles will add the points used by the matching triangles. The
actual size of the drawn star is based on its brightness, with a minimum of
1.
Summary
Image alignment (‘plate solving’) has become an essential feature offered
by most modern astrophotography systems, and is essential for smartscopes
that highly automate and simplify telescope imaging. It was long
recognized as an extremely difficult problem and was only solved in recent
decades through innovative pattern-matching techniques. These had to be
robust and insensitive to scale and rotation changes, exposure and
color-filter differences, image-center offsets, and transients, all of which
meant that comparing targets in different images often involved comparing
catalogs with different entries.
In this chapter, we demonstrated how a slightly simpler implementation
of the quad-based method works and found it could match up targets in one
image with those in another. Our code could be used to follow a target at a
fixed relative position, such as a variable star or nova, across a sequence of
images in order to follow its light curve. On the other hand, unmatched
targets could be explored to see if they were transients or moving objects
(asteroids/comets). In either case, it is important for students who might
rely on such software to have some appreciation of how it works, and there
are many possible experiments where other modifications to the quad
technique could be explored, and additional scripts developed to support
automating image collections.
Chapter 12
The Saha Equation and the Balmer
Spectrum
DOI: 10.1201/9781003600046-12
What makes the lines unique to an atom? The answer is atoms have
positively charged protons in their nuclei that are surrounded by negatively
charged electrons that occupy various electron levels. Atoms of different
elements have different numbers of protons, and the complex interaction
between the many charged particles will cause available electron-level
energies to vary from one element to the next. But electrons can only orbit
in particular levels; they can move between levels by emitting or absorbing
particular energies, which is why an atom can emit or absorb (block)
particular wavelengths of light. Temperature plays a role because it
determines how fast atoms are moving and how hard they bump into each
other, causing electrons to jump to higher levels or break free altogether.
Normally, electrons will fall back to a lower level if they can, within
millionths of a second, emitting a photon of light in the process with an
energy matching the energy lost in moving to the lower energy state,
resulting in emission spectra.
So, there can be very complex interactions at play; collisions might cause
electrons to move to higher levels or even break free. Electrons can fall
from higher levels into many successive lower ones; free electrons can be
captured into a level and then cascade down through lower ones to get to
the lowest available one. And atoms can become ionized, where electrons
are stripped away.
This means that not only does the structure of an atom influence the
photon energy transactions, but the number of transitions between various
levels is dictated by the temperature, which controls which levels are active.
At extremely cold temperatures, no visible light might be emitted at all
since atoms may be moving so slowly that collisions are too weak to bump
electrons to higher levels.
Our goal in this chapter is to model the simplest and most numerous
element in the universe, hydrogen, which has one proton and one electron,
and to show how its spectral lines change intensity with temperature, as the
roles of different energy states change with changing temperature.
Modelling Spectral Lines
Light emissions from a gas of hydrogen atoms will depend on its
temperature, since temperature, ultimately, is simply a measure of how fast
atoms and molecules move – this is why there is an absolute zero
temperature (−273 degrees Celsius): once particles stop moving, they
can't move any slower. The hotter a gas, the faster the particles move, and
the harder the impacts between atoms that drive electrons to higher levels
(or even free them), from which they can fall back to lower ones and release
photons of different wavelengths and energies that show up as an emission
spectrum such as that shown in Figure 12.1. (Note the spectrum in Figure
12.1 is only the pattern of lines in the visible part of the spectrum; at
other temperatures, other wavelength patterns can exist, outside what we
can see with the naked eye.)
To model these complex interactions for the hydrogen atom, we need to
be able to link the atom’s structure which dictates the possible wavelengths
that can be emitted/absorbed, to the actual distribution of electrons among
possible levels, since the numbers moving between levels generate the
actual emissions. However, the number of atoms that can participate in
emitting light can change with temperature – some fraction will have their
electrons totally stripped; at very high temperatures, most hydrogen atoms
will be fully ionized. (Note: It is beyond the scope of this book to derive the
equations needed for this modelling so we will simply present them and
show how they work.) The Balmer spectrum is seen when electrons fall
from higher levels down to the second level, and a Lyman spectrum when
they fall down into the first (ground) level.
The Saha equation tells us the ratio between the numbers of atoms in
consecutive ionization states. For multi-electron atoms, there can be many
possible states as electrons are stripped one by one from the atom. For
hydrogen, only two states are possible: Neutral, and fully ionized, with the
numbers in each being referred to as NI and NII.
The Saha equation, using the atom's ionization energy (χ, the energy
needed to strip the electron from the atom), gives us a relationship between
the different ionization states, using typical fundamental constants (e.g.,
mass of the electron me, Boltzmann constant k, and Planck's constant h):
$$\frac{N_{i+1}}{N_i} = \frac{2kT}{P_e}\,\frac{Z_{i+1}}{Z_i}\left(\frac{2\pi m_e kT}{h^2}\right)^{3/2} e^{-\chi/kT} \qquad (12.1)$$

$$\frac{N_{i+1}}{N_i} = \frac{C}{P_e}\,T^{2.5}\,e^{-\chi/kT} \qquad (12.2a)$$

$$\frac{N_{i+1}}{N_i} = \frac{0.033298}{P_e}\,T^{2.5}\,e^{-157878/T} \qquad (12.2b)$$
Using Equation 12.2 with 13.6 eV for hydrogen's ionization energy, for a
given temperature and an assumed Pe, we can now calculate NII/NI and
then f from Equation 12.3.
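As a minimal sketch of this calculation (assuming Equation 12.2b with the electron pressure Pe in pascals, and assuming f in Equation 12.3 is the un-ionized fraction NI/(NI + NII); this is not the book's saha class):

import numpy as np

def NII_over_NI(T, Pe):
    # Ionization ratio for hydrogen from Equation 12.2b (Pe assumed in Pa).
    return 0.033298 / Pe * T**2.5 * np.exp(-157878.0 / T)

def neutral_fraction(T, Pe):
    # f = NI / (NI + NII), the fraction of hydrogen left un-ionized.
    r = NII_over_NI(T, Pe)
    return 1.0 / (1.0 + r)

# Around 8,000 K most hydrogen is still neutral; near 10,000 K (for
# Pe of roughly 20 Pa) a large fraction is already ionized.
for T in (5000.0, 8000.0, 10000.0, 15000.0):
    print(T, neutral_fraction(T, 20.0))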
We will use a notation where we will refer to the number of atoms in a
particular state k (i.e., with electrons at level k) as nk. An emitted spectral
line results from electrons falling to a lower energy level and releasing the
energy difference in the form of a photon with that same energy, and that
energy will determine the photon’s wavelength and frequency. Because
levels are unevenly spaced where energy is concerned, the transitions
between them will involve unique energies and wavelengths. For example,
the intensity of a line for a transition from level 5 into level 2 will depend
on how many atoms are in the level 5 energy state and so we would need to
know what n5 was.
We still need to know the atomic energy states, so we can estimate
intensities. Boltzmann statistics show the relative numbers of atoms at
energy states ‘b’ compared to energy states ‘a’ (see Equation 12.4). It
depends on the temperature T and the energies of the individual states. The
terms ga and gb are simply the capacities (degeneracies) of the states – how
many electrons can exist at a given state. For hydrogen, g takes on the
values 2n2 (n = 1, 2, 3…) so its levels, starting from the lowest, hold 2, 8,
18 electrons and so on. Note, if T is very small, the exponential term goes to
zero, while if T becomes very large, the exponential term approaches 1.
Equation 12.4 gives a ratio of populations needed by our models since we
will assume the intensity is proportional to the population size.
$$B(a, b) = \frac{n_b}{n_a} = \frac{g_b}{g_a}\,e^{-(E_b - E_a)/kT} \qquad (12.4)$$
If we assume most of the un-ionized atoms are in the first and second
states so nTot = n1 + n2, then it follows that
$$\frac{n_k}{n_{Tot}} = \frac{n_k}{n_1 + n_2} = \frac{n_k/n_1}{1 + n_2/n_1} = \frac{B(1,k)}{1 + B(1,2)} \qquad (12.5)$$
Equation 12.5 is in a useful form for us, since we can now express
energy-level populations in terms of the Boltzmann function. We could
simply set the total count to unity and then we would have population sizes
as a fraction of the total. Or, we could simply use the total count as a scaling
factor to control chart scales.
We are now ready to estimate our Balmer and Lyman spectral line
intensities by combining the Boltzmann level populations (Equation 12.5)
with the neutral fraction given by the Saha equation.
Our models will rely on two main classes, saha and balmer, and a utility
found online that was wrapped in a class for converting wavelength values
into (R,G,B,A) colors. For completeness, the wavelength-to-color class is
shown in Appendix II.
The saha class is shown in Figure 12.3 and is very straightforward. If the
class is run by itself, the stub at lines 61–64 will produce a plot like that in
Figure 12.2.
Figure 12.3 Class saha.
To adjust the model, change the ZI and ZII parameters with the
get_partition_Zs function, and set the electron pressure and the partition
numbers (lines 16 and 25). To support plotting with the
plot_NII_fractions() function, lists of the ionization ratios (self.NN) and of
the un-ionized fractions for different temperatures are created by the
get_NN_ratios and get_NII_fractions functions. The ionization ratio for a
given temperature can be calculated using the get_NN() function.
The test_balmer.py program will create plots for line intensities and
synthetic spectra similar to those shown in Figures 12.5 and 12.6.
Figure 12.5 Our model’s Balmer line intensities for four different
temperatures.
Figure 12.6 The corresponding synthesized spectra for the results
in Figure 12.5.
The balmer class uses both the saha and w2rgb classes and is intended
to produce charts showing the line intensities and synthetic spectra for a
selection of temperatures and a specified electron pressure. The synthetic
spectra use a fixed line width, and to emulate intensity variation, without
changing the color, the RGB triplet component values for a line’s
wavelength are each multiplied by the ratio of the initial line intensity to the
series maximum for that temperature. As described earlier, line intensity is
estimated from the population at the upper level n1, multiplied by the
fraction of neutral atoms present.
Note, in the code, we use a naming convention where emission lines
occur between a higher level ‘k,’ and a lower one ‘j,’ where for the Balmer
series, j = 2. If other spectral series are of interest, j should be changed, for
example, for Lyman, use j = 1.
Class balmer consists of three main function groups. The first group is
for initializing the class, and also provides a utility function for
calculating the wavelength (get_lambda) associated with an electron
falling from level n1 to level n0 in the hydrogen atom.
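For reference, a wavelength calculation of this kind can be based on the Rydberg formula; the short sketch below is an assumption about how such a function might look, not the book's exact get_lambda.

# Rydberg constant for hydrogen, in 1/m.
RYDBERG = 1.0967758e7

def get_lambda(n1, n0):
    # Wavelength (nm) of the photon emitted when an electron falls
    # from level n1 down to level n0 (n1 > n0).
    inv_lambda = RYDBERG * (1.0 / n0**2 - 1.0 / n1**2)   # in 1/m
    return 1e9 / inv_lambda                              # convert m to nm

print(get_lambda(3, 2))   # Balmer H-alpha, about 656 nm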
Because Python indexing starts from zero, but the usual labelling in
physics for hydrogen levels starts with one, the get_E() and get_g()
functions allow us to use the physics indexing with the energy E and
degeneracy g lists.
The second group of functions use the Boltzmann and related equations
to calculate the population density ratio (self.get_nn), the number of atoms
in state k (self.get_n_k), the intensity of a transition from k down to j
(self.get_single_line_I), and a function to generate the line intensities for
the Balmer sequence (self.get_balmer_lines).
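To make that chain of calculations concrete, here is a minimal sketch, assuming hydrogen level energies E_n = 13.6(1 − 1/n²) eV, degeneracies g_n = 2n², and the neutral_fraction() helper from the earlier Saha sketch; the function names only loosely mirror the class described above.

import numpy as np

K_EV = 8.617333e-5                                            # Boltzmann constant in eV/K
E = [0.0] + [13.6 * (1 - 1.0 / n**2) for n in range(1, 11)]   # E[n] for n = 1..10
g = [0] + [2 * n**2 for n in range(1, 11)]                    # g[n] = 2n^2

def boltzmann(a, b, T):
    # B(a, b) = n_b / n_a from Equation 12.4.
    return g[b] / g[a] * np.exp(-(E[b] - E[a]) / (K_EV * T))

def level_fraction(k, T):
    # n_k / n_Tot from Equation 12.5 (neutral atoms assumed in levels 1 and 2).
    return boltzmann(1, k, T) / (1.0 + boltzmann(1, 2, T))

def balmer_intensity(k, T, Pe=20.0):
    # Relative intensity of the k -> 2 line: upper-level population
    # fraction times the neutral fraction from the Saha sketch.
    return level_fraction(k, T) * neutral_fraction(T, Pe)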
And finally, there are two plotting functions to create line intensity charts
and synthetic spectra.
Chapter 13
Isochrons
The Ages of Rocks
DOI: 10.1201/9781003600046-13
The number of parent atoms P remaining at time t from an initial count P0
follows the radioactive decay law, which can be written with a decay
constant λ or a characteristic timescale τ; the ratio of decayed to
surviving parents, and the daughter count D, follow directly:

$$P = P_0\,e^{-t/\tau} = P_0\,e^{-\lambda t} \qquad (13.1a)$$

$$\frac{\Delta P}{P(t)} = e^{\lambda t} - 1 \qquad (13.1c)$$

$$P = P_0\left(\frac{1}{2}\right)^{t/\tau} \qquad (13.2)$$

$$D = P_0 - P + D_0 \qquad (13.3)$$
where D0 is the initial amount present at the mineral's formation. If D0 is
zero (i.e., there was no daughter material present when the rock formed),
then by measuring D and P, we could figure out the sample age using either
Equation 13.1 or 13.2. These equations allow t to be expressed in terms of
three variables (P, P0, and τ), so measuring the number of D and P atoms,
from which P0 could be estimated, means all three variables would be known,
assuming τ is already known from lab experiments. However, the assumption
that the sample had no daughter material present initially is not necessarily
true, which means our age-determination task has to deal with four variables,
not three. We need something else, and this is where the isochron technique
can be used.
In this chapter, we will explain and model the isochron technique which
is designed to deal with daughter atoms in the initial sample and show how
it can be used to estimate ages of rocks.
Estimating the Age of Rocks using
Isochrons
The isochron method relies on analyzing amounts of different atom types,
and it is important we have a clear notation to minimize confusion. There
are three kinds of atoms we will track in our analysis. First there is the
parent type, P, which radioactively decays into a daughter type, D. For the
isochron technique, there must be two daughter isotopes: R is the
radiogenic one, the isotope P decays into, and I is an inert isotope of D. R
and I are indistinguishable chemically, and in any mineral, there will be a
certain number of daughters for every parent; this ratio will probably be
different for a different mineral.
We will assume there are no processes that can change I, and so the
amount of I in a rock or a sample remains unchanged after formation. When
thinking in terms of samples from a rock, we are thinking of crystals of
different minerals, that once set, lock in the atoms, so there is no change
from atoms entering or leaving the crystal.
Let’s now suppose there is a radioactive element (P, the parent) that
decays to become a daughter element R. A mineral might have a particular
ratio of P to D atoms; it doesn’t care about the proportions of R and I –
only that their sum is in the correct proportion to P. (To keep things simple,
we will use ‘P’ to refer to the atom type or to the number of those atoms,
and similarly for R and I.)
Minerals/crystals are the result of chemical processes, and, as we said, do
not care about the isotope differences. For example, each 100 atoms of a
mineral might need 20 P and 10 D types of atoms, and it doesn’t matter if
the daughter atoms consist of 8 or 3, or whatever, of the radiogenic, as long
as R and I add up to 10.
A different mineral, for every 100 atoms, might need 40 D and 8 P atom
types, and again, it doesn’t matter how much of R and I is in the 40 D
atoms, just as long as they add up to 40, for that mineral.
Now, let’s suppose a rock is forming from a molten mass, and during its
formation, crystals of both minerals are forming, and will become crystals
in the rock once it has formed. Since both types of crystals formed from the
same initial molten mass, we can reasonably assume both minerals will
have the same ratios for their initial R and I isotopes.
For example, let’s assume R and I are equally likely in the molten mass
and a rock forms with crystals of both mineral types. Then, for every 100-
atom sample from the first mineral, there might be 20 P, 5 R, and 5 I, and
for the second mineral 8 P, 20 R, and 20 I because at t = 0, when the rock
formed, R = I. However, as time passes, and parent atoms turn into R
atoms, after one half-life, the crystal sample compositions will have (10 P,
15 R, 5 I) and (4 P, 24 R, 20 I) respectively, so the R to I ratio changes
with age, and is different, for each mineral. This difference in behavior
represents a new aspect that can be measured, that is used as a ‘known’ by
the isochron technique, to correct for the ‘unknown’ initial daughter
quantity.
If we could monitor a crystal sample over time, P and R would change,
but I would remain constant, and all would depend on the sample size. To
exclude the effects of sample size, let's normalize all the P and R
measurements by scaling them to I; in other words, we will consider the
quantities P/I and R/I (as determined for each different mineral) as being of
primary interest. Note also that when determining the amounts of atoms
present using mass spectroscopy, it is easier to compare two types than to
get estimates of their individual quantities, which is another very important
reason to work with these ratios.
For our rock, we can plot our measurement on a chart where the X-axis is
P/I and the Y-axis is R/I. There will be a (P/I, R/I) data point from each
mineral in the rock. If, for our example, we plotted two points, one from
each mineral, we would find the line through them, called an isochron,
would intercept the Y-axis at a particular R/I value. As we shall
demonstrate, all the isochrons, for all t values studied, will all intersect the
Y-axis at the same value, and this must be the initial (t = 0) R/I value since
that is the only R/I value they all have in common. Knowing this, we can
now say what the initial R was, and figure out the elapsed time, and hence
the rock’s age.
We can express the radiogenic daughter count at time t as the sum of the
initial count and the count of those subsequently created as parent atoms
decay.
$$R(t) = R(0) + \frac{\Delta P(t)}{P(t)}\,P(t) \qquad (13.5)$$

$$\frac{R(t)}{I} = \frac{R(0)}{I} + \frac{\Delta P(t)}{P(t)}\,\frac{P(t)}{I} \qquad (13.6)$$

$$\frac{R(t)}{I} = \frac{R(0)}{I} + \left(e^{\lambda t} - 1\right)\frac{P(t)}{I} \qquad (13.7)$$
This is the equation of a straight line when R(t)/I is plotted against
P(t)/I, with a slope m = e^{λt} − 1 (Equation 13.8) and an intercept R(0)/I.
So now our problem is solved: we know the initial count of daughter atoms
and can use the slope to estimate the age:

$$t = \frac{\ln(m + 1)}{\lambda} \qquad (13.9)$$
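As a quick sketch of this last step (the decay constant and the two data points below are purely illustrative, not taken from the model output):

import numpy as np

def age_from_slope(m, lam):
    # Invert m = exp(lambda * t) - 1 to recover the age t (Equation 13.9).
    return np.log(m + 1.0) / lam

# Fit a straight line through the (P/I, R/I) points of one isochron.
P_over_I = np.array([4.0, 6.0])          # hypothetical sample values
R_over_I = np.array([1.5, 1.75])
m, intercept = np.polyfit(P_over_I, R_over_I, 1)
lam = np.log(2) / 1.0e9                  # decay constant for a 1 Gyr half-life
print(age_from_slope(m, lam), intercept)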
Two sample minerals were defined using the make_crystal() routine (see
lines 11 and 14). The first sample had 1000 parent (P) atoms with 500
daughter (D) atoms since the parent-daughter ratio (PD_ratio) was set to 2
(line 9). The ratio of R to D daughter isotopes was set to RD_ratio = .5 at
line 10, so there were 250 R and I = D−R (250) inert isotopes. In this
model, the P:D ratio is dictated by the mineral’s chemistry at the time of
formation, the while different minerals can have different PD_ratios, all
must have the same RD_ratio.
At any time t, get_crystal_counts_at_time_t() returns the population
counts of the parent and daughter atoms in the specified crystal.
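A minimal sketch of this bookkeeping might look as follows; the function names mirror the text, but the signatures and the dictionary representation are assumptions, not the book's code.

import numpy as np

def make_crystal(P0, PD_ratio, RD_ratio):
    # Build a crystal as a dict of initial atom counts.
    # PD_ratio = P/D at formation, RD_ratio = R/D at formation.
    D0 = P0 / PD_ratio
    R0 = D0 * RD_ratio
    return {'P0': P0, 'R0': R0, 'I': D0 - R0}

def get_crystal_counts_at_time_t(crystal, t, lam):
    # Counts of P, R and I after time t for decay constant lam.
    P = crystal['P0'] * np.exp(-lam * t)
    R = crystal['R0'] + (crystal['P0'] - P)   # decayed parents become R
    return P, R, crystal['I']

For example, make_crystal(1000, 2, 0.5) gives 1000 P, 250 R, and 250 I, matching the first sample described above.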
The isochron chart shows that the slope is zero at t = 0, as expected by
setting t = 0 in Equation 13.8. The tabular output is useful for seeing how
the counts of various atoms change among samples and for different times.
For example, the t = 0 isochron (blue) connects the points (P/I, R/I) for the
samples: (1000/250, 250/250) for sample 1 on the left end and (2000/333,
333/333) for sample 2 on the right end, or equivalently (4, 1) to (6, 1). The
y-intercept is 1, which is correct since R/I is 1 at t = 0.
Overall, the model has worked well and has demonstrated important
aspects such as all isochrons having the same intercept, and how the age
can be estimated from their slopes. Figure 13.3 also shows the estimated
ages displayed to the console. In this case, they are extremely accurate since
there is no noise in our models. In reality, there would be uncertainties in
the various measurements, so the isochrons, and hence the age estimates,
would be less precise.
Summary
The isochron method is a very powerful technique that relies on the
presence of a stable daughter isotope to provide an additional feature that
can be measured. It assumes all minerals, even those with daughter atoms
present, had a fixed daughter isotope ratio when they formed. This ratio
changes as minerals age and parent atoms decay into daughters, so even
though different minerals will have different parent-to-daughter ratios, even
at t = 0, all minerals initially share the same daughter isotope ratio.
In this chapter, we provided a simple code to model the technique and
found the estimated ages derived from the isochron method did indeed
match the model ages.
Model improvements that could be explored would include using
experimentally determined isotope ratios and atom counts, and perhaps
adding a noise component to the simulation to study the technique’s
robustness; for example, beyond what age do slope estimates become
unacceptably imprecise? It would also be easy (desirable) to add additional
crystals/minerals to the sample set, to produce isochrons with more than
two points!
Appendices
Appendix I: Class spatial
Class Spatial Programming Notes
Class spatial was developed to emphasize the spatial relationships between
stars by specifying a radius, identifying all neighbors within that radius for
all stars, and drawing radials from each star to a fixed number of neighbors.
An example scenario is shown following the usual ‘if __name__ ==’
construct at line 81.
The first 40 stars (the brightest) are read in from the catalog using
read_catalog() on line 89, and get_ijxy() is used to extract matrix and
cartesian versions of the star coordinates (line 91), after which a KDTree
(sp.T) is built (line 92). The KDTree structure creates a balanced tree that
facilitates searching for neighbors.
Function make_plots() creates the two panel output chart and specifies
the maximum number of radials to be drawn for any star (lines 97 and 69).
Radials are plotted by plot_all_radials() which, for every star (line 48),
searches for neighbors within distance r = 300 and plots that star’s radials
(up to Nmax of these) (lines 50–51). Each star's radials have the same color,
and colors are selected from a list self.pcolors (line 13).
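A small sketch of the neighbor search the class relies on is shown below, using scipy's KDTree with random stand-in coordinates rather than real star positions.

import numpy as np
from scipy.spatial import KDTree

xy = np.random.uniform(0, 1000, size=(40, 2))    # 40 stand-in star positions
T = KDTree(xy)                                    # balanced tree for fast lookups

Nmax, r = 5, 300.0
for i, p in enumerate(xy):
    neighbors = T.query_ball_point(p, r=r)        # indices within radius r
    neighbors = [n for n in neighbors if n != i][:Nmax]
    # radials would be drawn from star i to each of these neighbors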
Appendix II: Converting Wavelength to Color
The w2rgb class converts wavelengths expressed in nm (nanometers) to
color (RGBA) – which supports creation of nice charts and the synthetic
spectra. It uses a function written by Dan Bruton as described in the
comments included with the function.
Since we use the code simply for graphical and visual effect, we accept it
as being suitable for our needs because its results are visually effective and
the code is clear and easy to follow:
Index
Pages in italics refer to figures.
A
aerodynamic drag, 108
B
Balmer synthetic spectra, 176–177
Boltzmann equation, 171
D
DEBCat, 28–31
decay constant, 182
E
electrostatics, 68
event.key, 149
F
flux, 141, 182
H
half-life, 182, 184, 188
L
LEO orbit, 119–120
luminosity, 23
luminosity class, 21
M
magnitude, 20–24
matrix vs cartesian coordinates, 165
P
parallax, 20
particle distributions, 60
gaussian, 93
linear, 100
quadratic, 101–102
uniform, 98
plots
circle, 147
figsize, 17
hspace, 17
legend, 17
log-log, 15
subplots, 43, 97
suptitle, 17
tight-layout, 17
wspace, 17
Python data structures
arrays, 5–7
dataframes, 7–9
lists, 5–7
Python utilities
csv.reader, 147
df.copy, 163
df.tolist, 7
enumerate, 163
Fourier transforms, 36
iloc, 8
itertools, 166
KDTree, 154, 192
np.zeros, 7
polyfit, 48–52
range, 18
read_csv, 163
re.findall, 33
R
radiogenic daughter isotope, 183
rotation curves, 76
S
Saha equation, 171
SNR, 142, 144–145, 150
spectral class, 22
spiral galaxy model, 78
super().__init__(), 4
T
time slice, 109, 118
W
weather data, 46
Y
Yale Bright Star Catalog, 18–22