Phys II Eng Full
Contents
1 Electrostatic phenomena - György Hárs
  1.1 Fundamental experimental phenomena
  1.2 The electric field
  1.3 The flux
  1.4 Gauss's law
  1.5 Point charges and Coulomb's law
  1.6 Conservative force field
  1.7 Voltage and potential
  1.8 Gradient
  1.9 Spherical structures
    1.9.1 Metal sphere
    1.9.2 Sphere with uniform space charge density
  1.10 Cylindrical structures
    1.10.1 Infinite metal cylinder
    1.10.2 Infinite cylinder with uniform space charge density
  1.11 Infinite parallel plate with uniform surface charge density
  1.12 Capacitors
    1.12.1 Cylindrical capacitor
    1.12.2 Spherical capacitor
  1.13 Principle of superposition
2 Dielectric materials - György Hárs
  2.1 The electric dipole
  2.2 Polarization
  2.3 Dielectric displacement
  2.4 Electric permittivity (dielectric constant)
  2.5 Gauss's law and the dielectric material
  2.6 Inhomogeneous dielectric materials
  2.7 Demonstration examples
    2.7.1
    2.7.2
  2.8 Energy relations
    2.8.1 Energy stored in the capacitor
    2.8.2 Principle of the virtual work
3 Stationary electric current (direct current) - György Hárs
  3.1 Definition of Ampere
  3.2 Current density (j)
  3.3 Ohm's law
  3.4 Joule's law
  3.5 Microphysical interpretation
4 Magnetic phenomena in space - György Hárs
  4.1 The vector of magnetic induction (B)
  4.2 The Lorentz force
    4.2.1 Cyclotron frequency
    4.2.2 The Hall effect
  4.3 Magnetic dipole
  4.4 Earth as a magnetic dipole
  4.5 Biot-Savart law
    4.5.1 Magnetic field of the straight current
    4.5.2 Central magnetic field of the polygon and of the circle
  4.6 Ampere's law
    4.6.1 Thick rod with uniform current density
    4.6.2 Solenoid
    4.6.3 Toroidal coil
  4.7 Magnetic flux
5 Magnetic field and the materials - György Hárs
  5.1 Three basic types of magnetic behavior
  5.2 Solenoid coil with iron core
  5.3 Ampere's law and the magnetic material
  5.4 Inhomogeneous magnetic material
  5.5 Demonstration example
  5.6 Solenoid with iron core
6 Time dependent electromagnetic field - György Hárs
  6.1 Motion related electromagnetic induction
    6.1.1 Plane generator (DC voltage)
    6.1.2 Rotating frame generator (AC voltage)
    6.1.3 Eddy currents
  6.2 Electromagnetic induction at rest
    6.2.1 The mutual and the self induction
    6.2.2 Induced voltage of a current loop
    6.2.3 The transformer
    6.2.4 Energy stored in the coil
  6.3 The Maxwell equations
7 Electromagnetic oscillations and waves - Gábor Dobos
  7.1 Electrical oscillators
  7.2 Electromagnetic waves in perfect vacuum
  7.3 Electromagnetic waves in non-conductive media
  7.4 Direction of the E and B fields
  7.5 Poynting vector
  7.6 Light-pressure
  7.7 Skin depth
  7.8 Reflection and refraction
8 Geometrical Optics - Gábor Dobos
  8.1 Total internal reflection
  8.2 Spherical Mirror
  8.3 Thin spherical lenses
  8.4 Projection by spherical lenses and mirrors
  8.5 Aberrations
9 Wave optics - Gábor Dobos
  9.1 Young's double slit experiment
  9.2 Coherence
  9.3 Multiple slit diffraction
  9.4 Fraunhofer diffraction
  9.5 Thin layer interference
10 Einstein's Special Theory of Relativity - Gábor Dobos
  10.1 The Aether Hypothesis and The Michelson-Morley Experiment
  10.2 Einstein's Special Theory of Relativity
  10.3 Lorentz contraction and time dilatation
  10.4 Velocity addition
  10.5 Connection between relativistic and classical physics
Introduction
This work summarizes the lectures held by the author at the Budapest University of Technology and Economics. Long verbal explanations are omitted; the text gives only hints that help the reader recall the lecture. More detail can be found in Alonso and Finn, Fundamental University Physics, Volume II.
A physical quantity is the product of a numerical value and a physical unit. In contrast to mathematics, every physical quantity carries an accuracy (precision) as a secondary parameter. The accuracy is determined by the number of significant digits of the numerical value. For this reason 1500 V and 1.5 kV are not equivalent in terms of accuracy: they imply absolute errors of 1 V and 100 V, respectively. The frequently used relative error is the ratio of the absolute error to the nominal value. The smaller the relative error, the higher the accuracy of the measurement. When operating on physical quantities, remember that the result cannot be more accurate than the least accurate factor involved. For instance, when dividing 3.2165 V by 2.1 A to find the resistance of a conductor, the result 1.5316667 ohm is physically incorrect. It may contain only two significant digits, just like the current, so the correct result is 1.5 ohm.
Physical quantities are classified as fundamental quantities and derived quantities. The fundamental quantities and their units are defined by standards (etalons), which are kept at the relevant institute in Paris. The fundamental quantities of mechanics are length, time and mass, with the units meter (m), second (s) and kilogram (kg), respectively. These three are sufficient to build up mechanics. Derived quantities are all other quantities, which result from mathematical operations on the fundamental ones. To describe electric phenomena a fourth fundamental quantity, the electric current, has been introduced. Its unit is the ampere (A). It will be used extensively in Physics 2 when dealing with electricity.
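The significant-digit rule in the resistance example can be checked with a short script. This is only a minimal Python sketch: the helper name `round_sig` is hypothetical and is not part of the lecture material.

```python
import math

def round_sig(value: float, digits: int) -> float:
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    # Position of the leading digit, e.g. 0 for 1.53 and 3 for 1500.
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)

voltage = 3.2165   # V, five significant digits
current = 2.1      # A, two significant digits (the least accurate factor)

raw_resistance = voltage / current           # carries spurious digits
resistance = round_sig(raw_resistance, 2)    # keep two significant digits

print(f"raw: {raw_resistance} ohm, reported: {resistance} ohm")
```

The number of digits kept is taken from the least accurate input, here the current.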
1 Electrostatic phenomena
1.1 Fundamental experimental phenomena
Electrostatics deals with the phenomena of electric charges at rest. Electric charges can be generated by rubbing different insulating materials with cloth or fur. A device called an electroscope is used to detect and roughly measure electric charge. If a glass rod is rubbed and touched to the electroscope, the device indicates that charge has been transferred to it. Doing so a second time makes the electroscope indicate even more charge. Accordingly, charges of the same polarity add up and accumulate on the electroscope. Now replace the glass rod with a plastic rod. If the plastic rod is rubbed and touched to the charged electroscope, the deflection of the electroscope decreases. This shows that there are two kinds of charge of opposite polarity in nature, which neutralize each other. The electricity generated on glass is considered positive and that on plastic negative. The unit of charge is the coulomb. It is not a fundamental unit of the International System (SI), so the ampere second (As) is mostly used.
Now take a small ball (roughly 5 mm in diameter) of a very light material and hang it on a thread. This test device detects forces by being deflected from the vertical. Charge the ball positively and approach it with a charged rod. If the rod is positive the force is repulsive; if the rod is negative the force is attractive. This experiment demonstrates that opposite charges attract, while charges of the same polarity repel each other.
In the next experiment use a neutral test device. Bring the ball close to the charged rod. The originally neutral ball is attracted. If the rod is brought even closer, the ball is suddenly repelled once it touches the rod. The explanation is based on the phenomenon of electrostatic induction (also called electrostatic influence). Under the effect of the external charge the neutral ball becomes a dipole. For simplicity assume positive charge on the rod. The surface of the ball closer to the rod becomes negative, while the opposite side becomes positive. The attractive force on the nearer, opposite charge is larger (due to the smaller distance) than the repulsive force on the far side, so the ball experiences a net attractive force. When the rod touches the ball, the ball becomes positively charged and is immediately repelled.
1.2 The electric field
Charged objects exert forces on other charges in their proximity. The charge under investigation is called the source charge. To map the forces around the source charge, a hypothetical positive point-like charge is used, called the test charge and denoted by $q$. By means of the test charge the force versus position function can be recorded. Mathematically this is a vector-vector function, in other words a force field $\mathbf{F}(\mathbf{r})$. Experience shows that the magnitude of the force is linearly proportional to the test charge. Dividing the force field by the test charge gives a normalized quantity, the electric field $\mathbf{E}(\mathbf{r})$, which characterizes the electric state of the space generated solely by the source charge. The unit of the electric field is N/(As) or, more commonly, V/m. In Cartesian coordinates the vector field consists of three scalar functions of three variables each:
$$\frac{\mathbf{F}(\mathbf{r})}{q} = \mathbf{E}(\mathbf{r}) = E_x(x,y,z)\,\mathbf{i} + E_y(x,y,z)\,\mathbf{j} + E_z(x,y,z)\,\mathbf{k} \tag{1.1}$$
Scalar functions of one variable, $y = f(x)$, are easy to display as a curve in a Cartesian system. With two or three independent variables the function is a scalar field, which can be displayed by level curves or level surfaces. Displaying a vector field requires the concept of the force line. Force lines are hypothetical lines that satisfy the following criteria:
- The tangent of the force line is the direction of the force vector.
- The density of the force lines is proportional to the absolute value (intensity) of the vector.
The positive test charge is repelled by a positive source charge, therefore the lines of the electric field $\mathbf{E}(\mathbf{r})$ emerge from the positive source charge. One might say that the positive charge is the source of the electric field lines. (The outcome would be the same with a negative test charge: the force would be opposite, but after division by the negative test charge the direction of the electric field would be restored.) By symmetry, the negative source charge is the drain of the electric field lines. So the electric field lines start on positive charges and end on negative charges. When both positive and negative charges are present, the field lines leaving the positive charges are drained fully or partially by the negative charges. The field lines of an uncompensated positive charge end at infinity, and those of an uncompensated negative charge start at infinity. When several source charges are present in empty space, the principle of superposition is valid: the electric field vectors of the individual charges add as ordinary vectors.
1.3 The flux
To understand the concept of flux we start with a simple example and proceed to the general arrangement. Assume a tube with a stationary flow of water in which the velocity versus position vector field $\mathbf{v}(\mathbf{r})$ is homogeneous, in other words the velocity vector is the same everywhere. Now take a planar frame made of very thin wire, with area vector $\mathbf{A}$. By definition the area vector is normal to the surface and its absolute value is the area of the surface. Let us submerge the frame into the flowing water. The task is to find a formula for the amount of water passing through the frame per unit time. If the area vector is parallel to the velocity (that is, the velocity vector is normal to the surface), the flow rate $\Phi$ through the frame (in m³/s) is simply the product of the area and the velocity. If the angle between the area vector and the velocity is not zero, the area vector must be projected onto the direction of the velocity, which is done by multiplying by the cosine of the angle. So the flow rate is the dot product of the velocity vector and the area vector:
$$\Phi = \mathbf{v}\cdot\mathbf{A} \tag{1.2}$$
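The dot-product formula (1.2) can be tried out numerically. The sketch below is illustrative only: the flow speed, the frame area and the helper names `dot` and `area_vector` are assumptions chosen for the example, not values from the lecture.

```python
import math

def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def area_vector(area, angle_deg):
    """Area vector of a planar frame whose normal is tilted by angle_deg
    from the x axis, within the x-y plane."""
    angle = math.radians(angle_deg)
    return (area * math.cos(angle), area * math.sin(angle), 0.0)

velocity = (2.0, 0.0, 0.0)   # homogeneous flow along x in m/s (assumed value)
frame_area = 0.5             # frame area in m^2 (assumed value)

for angle_deg in (0, 60, 90):
    flow_rate = dot(velocity, area_vector(frame_area, angle_deg))
    print(f"angle {angle_deg:3d} deg: flow rate {flow_rate:.3f} m^3/s")
```

The flow rate is largest when the frame faces the flow and vanishes when the area vector is perpendicular to the velocity.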
Remember that the above simple formula is valid only for a homogeneous vector field and a planar frame. The question is how the argument can be extended to the general case, where the vector field is not homogeneous and the frame is a curved surface. The solution is to subdivide the surface into very small tiles which cover it like a mosaic on a curved wall. If the tiles are sufficiently small (mathematically, infinitesimal), the vector field can be considered homogeneous within a tile, and the tile itself can be considered planar. So the simple dot product can be used for each tile. For each tile a representative value of the velocity vector has to be chosen, since the velocity changes from place to place. The area vector also changes from tile to tile, since the surface is no longer planar. Finally the contributions of all tiles are summed. As the subdivision is refined without limit, the sum tends to a limit which is called the flux, or in mathematical terms the scalar-valued surface integral:
$$\Phi = \int_S \mathbf{v}(\mathbf{r})\cdot d\mathbf{A} \tag{1.3}$$
Here $S$ denotes an open surface over which the integration is carried out. An open surface has a rim and two sides, just like a sheet of paper. In contrast, a closed surface has no rim and divides 3D space into an internal and an external domain, just like a ball. For an open surface, the sense of circulation chosen along the rim determines the direction of the area vector by the right-hand screw rule. Since a closed surface has no rim, by convention its area vector points outward. Let us find the value of the above integral when a closed, penetrable surface is submerged into the flow of water. Physically it is clear that the water flows in on one side of the surface and flows out on the other. One can readily conclude that the overall flux through a closed surface is zero. This statement is true as long as the closed surface does not contain a source or a drain of water. If the closed surface contains a source, the velocity vectors all point outward through the surface, so the flux is positive and equal to the intensity of the source. Correspondingly, the result is negative when the surface contains a drain, and its absolute value is the intensity of the drain enclosed. The integral over a closed surface is denoted as follows:
$$\Phi = \oint_S \mathbf{v}(\mathbf{r})\cdot d\mathbf{A} \tag{1.4}$$
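The tiling procedure behind the closed-surface integral (1.4) can be imitated numerically. The following Python sketch sums $\mathbf{v}\cdot d\mathbf{A}$ over small tiles of a sphere. The function names, the grid resolution, the flow speed and the source intensity are assumptions made for illustration.

```python
import math

def flux_through_sphere(field, radius=1.0, n_theta=90, n_phi=180):
    """Midpoint-rule estimate of the outward flux of `field` through a
    sphere of the given radius centred at the origin."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Outward unit normal of the tile (closed-surface convention).
            normal = (math.sin(theta) * math.cos(phi),
                      math.sin(theta) * math.sin(phi),
                      math.cos(theta))
            point = tuple(radius * c for c in normal)
            tile_area = radius ** 2 * math.sin(theta) * d_theta * d_phi
            v = field(point)
            total += sum(vc * nc for vc, nc in zip(v, normal)) * tile_area
    return total

def uniform_flow(point):
    """Homogeneous flow along x; no source inside any closed surface."""
    return (2.0, 0.0, 0.0)

SOURCE_INTENSITY = 3.0  # m^3/s, assumed value

def point_source(point):
    """Radial flow from a source of given intensity at the origin."""
    r = math.sqrt(sum(c * c for c in point))
    scale = SOURCE_INTENSITY / (4.0 * math.pi * r ** 3)
    return tuple(scale * c for c in point)

flux_uniform = flux_through_sphere(uniform_flow)
flux_source = flux_through_sphere(point_source, radius=0.7)
print(flux_uniform, flux_source)
```

Under these assumptions the flux of the homogeneous flow comes out as zero within rounding error, while the flux around the enclosed source approximates the source intensity independently of the chosen radius.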
1.4 Gauss's law
In section 1.2 it was stated that positive and negative charges are the sources and the drains of the electric field lines, respectively. Combining this with the properties of the flux through a closed surface, the conclusion is clear: the flux of the electric field through a closed surface is zero as long as the surface does not contain charge. When it does contain charge, the flux is proportional to the amount of charge enclosed, and it has the same sign as the charge. As a formula this is Gauss's law:
$$\oint_S \mathbf{E}(\mathbf{r})\cdot d\mathbf{A} = \frac{Q}{\varepsilon_0} \tag{1.5}$$
On the right-hand side $Q$ denotes the total charge enclosed by the surface, and the vacuum permittivity $\varepsilon_0$ is a universal constant of nature ($\varepsilon_0 \approx 8.854 \cdot 10^{-12}$ As/(Vm)).
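Gauss's law holds for any closed surface, not only for symmetric ones. As a hedged numerical illustration, the Python sketch below integrates the field of a point charge over the six faces of a cube. It assumes the point-charge field that follows from Gauss's law in the next section; the charge value, its position, the grid size and the function names are hypothetical choices for this example.

```python
import math

EPSILON_0 = 8.854e-12  # vacuum permittivity in As/(Vm)

def point_charge_field(charge, source, point):
    """Electric field vector of a point charge located at `source`."""
    d = [p - s for p, s in zip(point, source)]
    r = math.sqrt(sum(c * c for c in d))
    scale = charge / (4.0 * math.pi * EPSILON_0 * r ** 3)
    return [scale * c for c in d]

def flux_through_cube(charge, source, half_side=1.0, n=100):
    """Midpoint-rule flux through the cube |x|, |y|, |z| <= half_side."""
    h = 2.0 * half_side / n
    total = 0.0
    for axis in range(3):              # faces normal to x, y and z
        for sign in (-1.0, 1.0):       # outward normal is sign * e_axis
            for i in range(n):
                u = -half_side + (i + 0.5) * h
                for j in range(n):
                    v = -half_side + (j + 0.5) * h
                    point = [u, v]
                    point.insert(axis, sign * half_side)
                    field = point_charge_field(charge, source, point)
                    total += sign * field[axis] * h * h
    return total

Q = 1.0e-8  # As, assumed example charge
flux_inside = flux_through_cube(Q, source=(0.3, -0.2, 0.1))   # charge enclosed
flux_outside = flux_through_cube(Q, source=(3.0, 0.0, 0.0))   # charge outside
print(flux_inside, Q / EPSILON_0, flux_outside)
```

With the charge inside the cube the flux approximates $Q/\varepsilon_0$ even though the charge is off-center; with the charge outside, the inflow and outflow cancel.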
We want to use Gauss's law to solve problems in which the charge arrangement is given and the distribution of the electric field is to be found. Gauss's law is an integral law, and in general information is lost by integration. The only case in which information is preserved is when the function to be integrated is constant. Therefore there are three distinct classes of charge arrangements for which Gauss's law can be used effectively:
- Spherically symmetric
- Cylindrically symmetric, infinitely long
- Plane parallel, infinitely large
In all other cases Gauss's law is still true as an integral statement, but the local electric field cannot be determined from it. To use the law, one needs a closed surface with the same symmetry as the charge arrangement. On this surface the electric field vector is necessarily normal to the surface and its magnitude is constant. In this way the flux integral simplifies to the product of the area and the electric field magnitude.
1.5 Point charges and Coulomb's law
Let us apply Gauss's law to a point charge. The point charge is a model with zero extension and finite (not infinitesimal) charge. Accordingly its charge density and electrostatic energy are infinite. Nevertheless it is a useful model for arrangements whose dimensions are much larger than the charges themselves. The electric field around the point charge is spherically symmetric, so the surface to be used is a sphere. The surface area of the sphere is $4\pi r^2$. Accordingly Gauss's law can be written as follows:
$$4\pi r^2 E = \frac{Q}{\varepsilon_0} \tag{1.6}$$
The electric field can readily be expressed:
$$E = \frac{1}{4\pi\varepsilon_0}\frac{Q}{r^2} \tag{1.7}$$
This is the electric field of the point charge, which will be used extensively later in this chapter. The force exerted on a charge $q$ can be written:
$$F = \frac{1}{4\pi\varepsilon_0}\frac{Qq}{r^2} \tag{1.8}$$
This is Coulomb's law, which describes the force between point charges. For practical purposes it is worth remembering the value of the constant in Coulomb's law:
$$k = \frac{1}{4\pi\varepsilon_0} \approx 9\cdot 10^{9}\ \frac{\text{Vm}}{\text{As}} \tag{1.9}$$
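Equations (1.7) to (1.9) translate directly into a few lines of Python. This is a minimal sketch: the function names and the example charges and distance are assumptions, and the vacuum permittivity is taken as approximately 8.854e-12 As/(Vm).

```python
import math

EPSILON_0 = 8.854e-12                    # vacuum permittivity in As/(Vm)
K = 1.0 / (4.0 * math.pi * EPSILON_0)    # constant of eq. (1.9), about 9e9 Vm/(As)

def point_charge_field(source_charge, distance):
    """Magnitude of the electric field of a point charge, eq. (1.7)."""
    return K * source_charge / distance ** 2

def coulomb_force(source_charge, test_charge, distance):
    """Force on the test charge, eq. (1.8); a positive value means repulsion."""
    return test_charge * point_charge_field(source_charge, distance)

Q = 1.0e-8   # source charge in As (assumed example value)
q = 1.0e-9   # test charge in As (assumed example value)
r = 0.1      # separation in m (assumed example value)

print(f"k = {K:.4e} Vm/(As)")
print(f"E = {point_charge_field(Q, r):.1f} V/m")
print(f"F = {coulomb_force(Q, q, r):.3e} N")
```

Charges of equal sign give a positive (repulsive) force and charges of opposite sign a negative (attractive) one; doubling the distance reduces the force to one quarter.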
1.6 Conservative force field
A force field is a vector-vector function in which the force vector $\mathbf{F}$ depends on the position vector $\mathbf{r}$. Mathematically the force field $\mathbf{F}(\mathbf{r})$ is described as follows:
$$\mathbf{F}(\mathbf{r}) = X(x,y,z)\,\mathbf{i} + Y(x,y,z)\,\mathbf{j} + Z(x,y,z)\,\mathbf{k} \tag{1.10}$$
where $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ are the unit vectors of the coordinate system. Take a test charge and move it slowly in the force field $\mathbf{F}(\mathbf{r})$ from position A to position B along two alternative paths.
Figure 1.1: Integration on two paths
Let us calculate the work done on each path. The force exerted on the test charge by the hand is exactly the opposite of the field force, $-\mathbf{F}(\mathbf{r})$; otherwise the charge would accelerate. The motion is assumed to be quasi-static, without acceleration.
$$W_1 = \int_{A,\ \text{path 1}}^{B} (-\mathbf{F})\cdot d\mathbf{r}, \qquad W_2 = \int_{A,\ \text{path 2}}^{B} (-\mathbf{F})\cdot d\mathbf{r} \tag{1.11}$$
In general $W_1$ and $W_2$ are not equal. In some special cases, however, they are equal for any two paths. Imagine that our force field is such that $W_1$ and $W_2$ are equal. Then a closed loop can be formed which starts along path 1 and returns to the starting point along path 2. Since traversing path 2 in the opposite direction turns $W_2$ into its negative, the work around the closed loop is zero. A force field for which the integral vanishes on every closed loop is called a CONSERVATIVE force field. As a formula:
$$\oint \mathbf{F}(\mathbf{r})\cdot d\mathbf{r} = 0 \tag{1.12}$$
Using the electric field with $\mathbf{F} = q\mathbf{E}$, the above equation becomes:
$$\oint \mathbf{E}(\mathbf{r})\cdot d\mathbf{r} = 0 \tag{1.13}$$
According to experience the electrostatic field is conservative: its integral around any closed loop is zero. This also means that the line integral between any two points is independent of the path and depends only on the starting and the final point.
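The closed-loop criterion (1.12) can be tested numerically. The Python sketch below integrates two fields around a circle in the x-y plane: a Coulomb-like field of a point charge at the origin (with all constants set to one) and, for contrast, a rotating field that is not conservative. The loop geometry, the field names and the resolution are assumptions made for illustration.

```python
import math

def loop_integral(field, center=(2.0, 0.0), radius=1.0, n=2000):
    """Midpoint-rule estimate of the closed-loop integral of F.dr
    around a circle in the x-y plane."""
    d_phi = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * d_phi
        x = center[0] + radius * math.cos(phi)
        y = center[1] + radius * math.sin(phi)
        # Displacement along the circle for this step.
        dx = -radius * math.sin(phi) * d_phi
        dy = radius * math.cos(phi) * d_phi
        fx, fy = field(x, y)
        total += fx * dx + fy * dy
    return total

def coulomb_like(x, y):
    """Field of a point charge at the origin, constants set to 1."""
    r3 = (x * x + y * y) ** 1.5
    return (x / r3, y / r3)

def swirl(x, y):
    """A rotating field, used only as a non-conservative counterexample."""
    return (-y, x)

print(loop_integral(coulomb_like))   # close to zero
print(loop_integral(swirl))          # close to 2*pi*radius**2, not zero
```

The Coulomb-like field gives zero within rounding error on this loop, consistent with a conservative field, while the rotating field gives twice the enclosed area.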
1.7 Voltage and potential
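The work done against the force of the electric field while moving the test charge from A to B is as follows:
$$W_A^B = \int_A^B (-\mathbf{F})\cdot d\mathbf{r} = q\int_A^B (-\mathbf{E})\cdot d\mathbf{r} \tag{1.14}$$
Dividing by the test charge gives the voltage:
$$U_A^B = \frac{W_A^B}{q} = -\int_A^B \mathbf{E}(\mathbf{r})\cdot d\mathbf{r} \tag{1.15}$$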
The formula above gives the voltage of point B relative to point A. Clearly the voltage depends on two points. If the starting point is chosen as a common reference point for all integrals, the resulting voltage depends on the final point only. This one-parameter voltage is called the potential.
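$$U_B = -\int_{\text{ref}}^{B} \mathbf{E}(\mathbf{r})\cdot d\mathbf{r} \tag{1.16}$$
The voltage between two points is the difference of their potentials:
$$U_A^B = -\int_{A}^{\text{ref}} \mathbf{E}(\mathbf{r})\cdot d\mathbf{r} - \int_{\text{ref}}^{B} \mathbf{E}(\mathbf{r})\cdot d\mathbf{r} = -\int_{\text{ref}}^{B} \mathbf{E}(\mathbf{r})\cdot d\mathbf{r} + \int_{\text{ref}}^{A} \mathbf{E}(\mathbf{r})\cdot d\mathbf{r} = U_B - U_A \tag{1.17}$$
As a function of position the potential is:
$$U(\mathbf{r}) = -\int_{\text{ref}}^{\mathbf{r}} \mathbf{E}(\mathbf{r}')\cdot d\mathbf{r}' \tag{1.18}$$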
The voltage always exists and its value is definite. The value of the potential is indefinite because it also depends on the reference point. It is possible, however, to define a definite potential by placing the reference point of the integral at infinity. This can be done for physically real objects, for which the corresponding improper integral is convergent. By definition, a physically real object is one that appears to shrink to a point when viewed from infinitely far away. The spherical charge arrangement is the only physically real object among those mentioned above. An infinitely long cylinder or an infinite plane parallel plate is not physically real, since it still looks infinite when viewed from infinity.
1.8 Gradient
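The gradient is an operation of vector calculus by which the electric field vector is generated from the scalar potential field. In the general case the formula is as follows:
$$\operatorname{grad} U(\mathbf{r}) = \frac{\partial U(\mathbf{r})}{\partial x}\,\mathbf{i} + \frac{\partial U(\mathbf{r})}{\partial y}\,\mathbf{j} + \frac{\partial U(\mathbf{r})}{\partial z}\,\mathbf{k} = -\mathbf{E}(\mathbf{r}) \tag{1.19}$$
Therefore:
$$\mathbf{E}(\mathbf{r}) = -\operatorname{grad} U(\mathbf{r}) \tag{1.20}$$
In the special cases of spherical, cylindrical and plane parallel structures the gradient reduces to a derivative with respect to the single position variable:
$$E(r) = -\frac{dU(r)}{dr} \tag{1.21}$$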
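The relation between field and potential in (1.20) can be illustrated numerically by replacing the partial derivatives with central differences. The sketch below assumes the potential of a point charge, $U = kQ/r$ with the reference at infinity, and the rounded constant $k = 9\cdot 10^9$ Vm/(As); the charge value, the step size and the function names are hypothetical.

```python
import math

K = 9.0e9    # rounded constant of eq. (1.9) in Vm/(As)
Q = 1.0e-8   # point charge in As (assumed example value)

def potential(x, y, z):
    """Potential of a point charge at the origin, reference at infinity."""
    return K * Q / math.sqrt(x * x + y * y + z * z)

def field_from_potential(x, y, z, h=1.0e-5):
    """E = -grad U, each partial derivative taken as a central difference."""
    ex = -(potential(x + h, y, z) - potential(x - h, y, z)) / (2.0 * h)
    ey = -(potential(x, y + h, z) - potential(x, y - h, z)) / (2.0 * h)
    ez = -(potential(x, y, z + h) - potential(x, y, z - h)) / (2.0 * h)
    return (ex, ey, ez)

def field_exact(x, y, z):
    """Point-charge field written directly as a vector, for comparison."""
    r = math.sqrt(x * x + y * y + z * z)
    scale = K * Q / r ** 3
    return (scale * x, scale * y, scale * z)

point = (0.3, 0.4, 0.0)   # 0.5 m from the charge
print(field_from_potential(*point))
print(field_exact(*point))
```

Both results agree to within the discretization error, and the field points away from the positive charge, in the direction of decreasing potential.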
1.9 Spherical structures
1.9.1 Metal sphere
A metal sphere with radius $R = 0.1$ m carries a charge $Q = 10^{-8}$ As. Find the electric field and the potential as functions of the distance from the center and sketch the result. Calculate the values of the electric field and the potential on the surface of the metal sphere. Determine the capacitance of the metal sphere.
Metal contains free electrons, therefore no electric field can exist inside the bulk of the metal in the static case. If there were an electric field in the metal, the free electrons would move very quickly until they compensate it to zero. Since there is no electric field inside the metal, the whole volume of the metal is equipotential. The electric field vector is always normal (perpendicular) to the metal (equipotential) surface. The proof is as follows: if the angle differed from ninety degrees, the electric field vector could be decomposed into normal and tangential components, and the tangential component would move the electrons until it is compensated. In the stationary case all the excess charge resides on the surface of the metal. Therefore a hollow metal body is equivalent to a solid one in terms of electrostatics. On a conductor at a given potential, the surface charge density and the surface electric field are inversely proportional to the radius of curvature.
$$\oint_S \mathbf{E}(\mathbf{r})\cdot d\mathbf{A} = \frac{Q}{\varepsilon_0} \tag{1.22}$$
Gauss's law is used to solve the problem. We take a virtual point-like balloon (a concentric spherical surface) and inflate it from zero radius to infinity. As long as the balloon is inside the metal sphere, it contains no charge:
$$4\pi r^2 E = 0 \tag{1.23}$$
$$E = 0 \tag{1.24}$$
So the electric field inside the metal sphere is zero. Outside the metal sphere, however, the enclosed charge is the amount given in the problem:
$$4\pi r^2 E = \frac{Q}{\varepsilon_0} \tag{1.25}$$
The electric field can be expressed:
$$E(r) = \frac{1}{4\pi\varepsilon_0}\frac{Q}{r^2} \tag{1.26}$$
The electric field on the surface of the metal sphere is obtained by substituting $r = R$ into the above function:
$$E(R) = \frac{1}{4\pi\varepsilon_0}\frac{Q}{R^2} = 9\cdot 10^{9}\cdot\frac{10^{-8}}{0.1^2} = 9000\ \frac{\text{V}}{\text{m}} \tag{1.27}$$
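The potential outside the metal sphere is obtained by integrating the electric field from the reference point at infinity:
$$U(r) = -\int_{\infty}^{r} E(r')\,dr' = -\int_{r'=\infty}^{r'=r} \frac{Q}{4\pi\varepsilon_0}\frac{1}{r'^2}\,dr' = \frac{Q}{4\pi\varepsilon_0}\int_{\infty}^{r}\left(-\frac{1}{r'^2}\right)dr' \tag{1.28}$$
$$U(r) = \frac{Q}{4\pi\varepsilon_0}\left[\frac{1}{r'}\right]_{r'=\infty}^{r'=r} = \frac{Q}{4\pi\varepsilon_0}\left(\frac{1}{r}-\frac{1}{\infty}\right) = \frac{Q}{4\pi\varepsilon_0}\frac{1}{r} \tag{1.29}$$
So briefly the potential function outside the metal sphere is as follows:
$$U(r) = \frac{1}{4\pi\varepsilon_0}\frac{Q}{r} \tag{1.30}$$
The potential on the surface of the metal sphere is obtained by substituting $r = R$ into the above function:
$$U(R) = \frac{1}{4\pi\varepsilon_0}\frac{Q}{R} = 9\cdot 10^{9}\cdot\frac{10^{-8}}{0.1} = 900\ \text{V} \tag{1.31}$$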
Inside the metal sphere the potential is constant due to the zero electric field.
Figure 1.2: Metal sphere. Panels: electric field vs. radial position; potential vs. radial position.
An interesting result can be obtained by dividing the formula of the potential by that of the electric field on the surface:
$$\frac{U(R)}{E(R)} = \frac{\dfrac{1}{4\pi\varepsilon_0}\dfrac{Q}{R}}{\dfrac{1}{4\pi\varepsilon_0}\dfrac{Q}{R^2}} = R \tag{1.32}$$
The electric field on the surface is the ratio of the potential and the radius:
$$E(R) = \frac{U(R)}{R} \tag{1.33}$$
The result is in perfect agreement with the numerical values above. It is useful when a high electric field is desired: an ultra-sharp needle (very small radius) is connected to a high potential. By means of such a device corona discharge can be generated in air.
Capacitance is a general term in physics meaning a kind of storage capability. More precisely, it is the ratio of an extensive parameter to the corresponding intensive parameter. For instance, the heat capacity is the ratio of the heat energy to the temperature. Similarly, the electric capacitance is the ratio of the charge to the generated potential. The unit of capacitance is As/V, which is called the farad (F) in honor of the famous scientist Faraday. The farad is a very large unit, therefore pF or µF is mostly used. Capacitance, denoted by $C$, is a property of all physically real conductive objects. In contrast, the capacitor is a device used in electronics with intentionally high capacitance.
$$U = \frac{1}{4\pi\varepsilon_0}\frac{Q}{R} \tag{1.34}$$
$$C = \frac{Q}{U} = 4\pi\varepsilon_0 R \tag{1.35}$$
So the capacitance of the metal sphere is proportional to its radius. It is worth remembering that a sphere of one meter radius has a capacitance of about 111 pF. The capacitance of the human body is in the range of some tens of pF.
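The results for the metal sphere can be collected in a short Python sketch. It uses the rounded constant $k = 9\cdot 10^9$ Vm/(As) from (1.9) so that the numbers match the worked values of this section; the function names are assumptions made for illustration.

```python
K = 9.0e9     # rounded value of 1/(4*pi*eps0) from eq. (1.9), in Vm/(As)
R = 0.1       # sphere radius in m
Q = 1.0e-8    # charge on the sphere in As

def electric_field(r):
    """Radial field: zero inside the metal, eq. (1.26) outside."""
    return 0.0 if r < R else K * Q / r ** 2

def potential(r):
    """Potential relative to infinity: constant inside, eq. (1.30) outside."""
    return K * Q / max(r, R)

def capacitance(radius):
    """C = Q/U = 4*pi*eps0*R = R/k, eq. (1.35)."""
    return radius / K

print(f"E(R) = {electric_field(R):.0f} V/m")
print(f"U(R) = {potential(R):.0f} V")
print(f"C    = {capacitance(R) * 1e12:.1f} pF")
print(f"C (1 m sphere) = {capacitance(1.0) * 1e12:.0f} pF")
```

The ratio of the surface potential to the surface field returns the radius, as stated in (1.32).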
1.9.2 Sphere with uniform space charge density
A sphere with radius $R = 0.1$ m contains a uniform space charge density $\rho = 10^{-6}$ As/m³. (The charge is immobile. Imagine that wax is melted, charged up and left to cool down, so that the charges are effectively trapped in the wax.) Find the electric field and the potential as functions of the distance from the center and sketch the result. Calculate the value of the electric field on the surface of the sphere and the value of the potential on the surface and in the center.
$$\oint_S \mathbf{E}(\mathbf{r})\cdot d\mathbf{A} = \frac{Q}{\varepsilon_0} \tag{1.36}$$
We take a virtual point-like balloon and inflate it from zero radius to infinity. Inside the charged sphere Gauss's law reads:
$$4\pi r^2 E = \frac{4\pi r^3}{3}\frac{\rho}{\varepsilon_0} \tag{1.37}$$
The left-hand side is the flux; the right-hand side is the volume of the balloon multiplied by the charge density and divided by $\varepsilon_0$. Many terms cancel:
$$E(r) = \frac{\rho}{3\varepsilon_0}\,r \tag{1.38}$$
The result is not surprising. As the radius increases inside the sphere, the enclosed charge grows with the third power while the surface area grows with the second power, so their ratio is linear. Outside the charged sphere the enclosed charge no longer grows; only the surface area of the balloon continues to grow with the second power:
$$4\pi r^2 E = \frac{4\pi R^3}{3}\frac{\rho}{\varepsilon_0} \tag{1.39}$$
$$E(r) = \frac{\rho R^3}{3\varepsilon_0}\frac{1}{r^2} \tag{1.40}$$
The two equations above show that the electric field is a continuous function, since substituting $r = R$ gives the same result in both. The numerical value of the electric field on the surface of the sphere can readily be calculated:
$$E(R) = \frac{\rho}{3\varepsilon_0}R = \frac{10^{-6}}{3\cdot 8.854\cdot 10^{-12}}\cdot 0.1 \approx 3765\ \frac{\text{V}}{\text{m}} \tag{1.41}$$
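The potential function is determined by integrating the electric field. First the external region is integrated, with the reference point at infinity:
$$U_{out}(r) = -\int_{\infty}^{r} E(r')\,dr' = -\int_{r'=\infty}^{r'=r} \frac{\rho R^3}{3\varepsilon_0}\frac{1}{r'^2}\,dr' = \frac{\rho R^3}{3\varepsilon_0}\int_{\infty}^{r}\left(-\frac{1}{r'^2}\right)dr' \tag{1.42}$$
$$U_{out}(r) = \frac{\rho R^3}{3\varepsilon_0}\left[\frac{1}{r'}\right]_{r'=\infty}^{r'=r} = \frac{\rho R^3}{3\varepsilon_0}\left(\frac{1}{r}-\frac{1}{\infty}\right) = \frac{\rho R^3}{3\varepsilon_0}\frac{1}{r} \tag{1.43}$$
So briefly the potential function outside the charged sphere is as follows:
$$U_{out}(r) = \frac{\rho R^3}{3\varepsilon_0}\frac{1}{r} \tag{1.44}$$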
The surface potential of the sphere is the above function with $r = R$ substituted:
$$U(R) = \frac{\rho R^2}{3\varepsilon_0} = \frac{10^{-6}\cdot 10^{-2}}{3\cdot 8.854\cdot 10^{-12}} \approx 376\ \text{V} \tag{1.45}$$
Remember that this value has to be added to the integral calculated next. Inside the charged sphere the integrand is different:
$$U_{in}(r) = U(R) + U_R^r = U(R) + \left(-\int_R^r E(r')\,dr'\right) \tag{1.46}$$
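For simplicity, only the integral term $U_R^r$ is evaluated first:
$$U_R^r = -\int_R^r \frac{\rho}{3\varepsilon_0}\,r'\,dr' = -\frac{\rho}{3\varepsilon_0}\int_R^r r'\,dr' = -\frac{\rho}{3\varepsilon_0}\left[\frac{r'^2}{2}\right]_{r'=R}^{r'=r} = -\frac{\rho}{3\varepsilon_0}\left(\frac{r^2}{2}-\frac{R^2}{2}\right) \tag{1.47}$$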
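Altogether:
$$U_{in}(r) = U(R) + U_R^r = \frac{\rho R^2}{3\varepsilon_0} - \frac{\rho}{3\varepsilon_0}\left(\frac{r^2}{2}-\frac{R^2}{2}\right) = \frac{\rho}{3\varepsilon_0}\left(\frac{3R^2}{2}-\frac{r^2}{2}\right) = \frac{\rho}{6\varepsilon_0}\left(3R^2 - r^2\right) \tag{1.48}$$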
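The numerical value of the potential in the center is given by the above equation with $r = 0$ substituted:
$$U_{in}(0) = \frac{\rho}{6\varepsilon_0}\,3R^2 = \frac{\rho R^2}{2\varepsilon_0} = \frac{10^{-6}\cdot 10^{-2}}{2\cdot 8.854\cdot 10^{-12}} \approx 565\ \text{V} \tag{1.50}$$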
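The piecewise field and potential of the uniformly charged sphere can be evaluated with a short Python sketch. It assumes $\varepsilon_0 \approx 8.854\cdot 10^{-12}$ As/(Vm), so the last digit may differ slightly from values computed with a differently rounded constant; the function names are hypothetical.

```python
EPSILON_0 = 8.854e-12   # vacuum permittivity in As/(Vm)
RHO = 1.0e-6            # space charge density in As/m^3
R = 0.1                 # sphere radius in m

def electric_field(r):
    """Radial field inside (eq. 1.38) and outside (eq. 1.40) the sphere."""
    if r <= R:
        return RHO * r / (3.0 * EPSILON_0)
    return RHO * R ** 3 / (3.0 * EPSILON_0 * r ** 2)

def potential(r):
    """Potential relative to infinity, inside (eq. 1.48) and outside (eq. 1.44)."""
    if r <= R:
        return RHO * (3.0 * R ** 2 - r ** 2) / (6.0 * EPSILON_0)
    return RHO * R ** 3 / (3.0 * EPSILON_0 * r)

print(f"E(R) = {electric_field(R):.0f} V/m")
print(f"U(R) = {potential(R):.0f} V")
print(f"U(0) = {potential(0.0):.0f} V")
```

Both functions are continuous at $r = R$, and the central potential is 1.5 times the surface potential, as the formulas require.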
Figure 1.3: Sphere with uniform charge density. Panels: electric field vs. radial position; potential vs. radial position.
1.10 Cylindrical structures
1.10.1 Infinite metal cylinder
An infinite metal cylinder (tube) with radius $R = 0.1$ m carries a surface charge density $\sigma = 10^{-8}$ As/m². Find the electric field and the potential as functions of the distance from the center and sketch the result. The reference point of the potential should be the center. Calculate the values of the electric field and of the potential on the surface of the metal cylinder.
$$\oint_S \mathbf{E}(\mathbf{r})\cdot d\mathbf{A} = \frac{Q}{\varepsilon_0} \tag{1.51}$$
Gauss's law is used to solve the problem. We pick a virtual line-like tube of length $l$ and inflate its radius from zero to infinity. Inside the metal cylinder the virtual tube contains no charge.

$$2\pi r\, l\, E = 0 \tag{1.52}$$

$$E = 0 \tag{1.53}$$

So the electric field inside the metal cylinder is zero. Outside the metal cylinder, however, the enclosed charge is that of the cylinder surface:

$$2\pi r\, l\, E = \frac{2\pi R\, l\, \sigma}{\varepsilon_0} \tag{1.54}$$

The electric field can be expressed:

$$E(r) = \frac{\sigma R}{\varepsilon_0}\,\frac{1}{r} \tag{1.55}$$
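On the surface of the metal cylinder the electric field is obtained by substituting $r = R$ into the above function:

$$E(R) = \frac{\sigma R}{\varepsilon_0}\,\frac{1}{R} = \frac{\sigma}{\varepsilon_0} = \frac{10^{-8}}{8.86 \cdot 10^{-12}} = 1129\ \frac{\mathrm{V}}{\mathrm{m}} \tag{1.56}$$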
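Since the field is zero inside, the potential outside the cylinder is obtained by integrating from the surface:

$$U(r) = -\int_{R}^{r} E(r')\,dr' = -\int_{R}^{r} \frac{\sigma R}{\varepsilon_0}\,\frac{1}{r'}\,dr' = -\frac{\sigma R}{\varepsilon_0}\Big[\ln r'\Big]_{r'=R}^{r'=r} = -\frac{\sigma R}{\varepsilon_0}\ln\frac{r}{R} \tag{1.57}$$

So, briefly, the potential function outside the metal cylinder is as follows:

$$U(r) = -\frac{\sigma R}{\varepsilon_0}\ln\frac{r}{R} \tag{1.58}$$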
On the surface of the metal cylinder the potential is obtained by substituting $r = R$ into the above function:

$$U(R) = -\frac{\sigma R}{\varepsilon_0}\ln\frac{R}{R} = 0\ \mathrm{V} \tag{1.59}$$
The result is as expected: inside the metal cylinder the electric field is zero, so the potential is constant and equal to its value at the reference point in the center. Note that the reference point of the potential cannot be placed at infinity, because the infinitely long cylinder is not a physically real object. Accordingly, the improper integral does not converge.
1.10.2 Infinite cylinder with uniform space charge density

A uniform space charge density ($\rho = 10^{-6}$ As/m$^3$) is contained in an infinite cylinder with radius $R = 0.1$ m. (The charge is immobile. Imagine that wax is melted, charged up and left to cool down, so that the charges are effectively trapped in the wax.) Find the electric field and the potential as functions of the distance from the center and sketch the result. Calculate the value of the electric field and the potential on the surface of the cylinder. The reference point of the potential should be the central line.

$$\oint_S \mathbf{E}(\mathbf{r})\,d\mathbf{A} = \frac{Q}{\varepsilon_0} \tag{1.60}$$
Figure 1.4: Metal cylinder. Electric field vs. radial position; potential vs. radial position.
We pick a virtual line-like tube of length $l$ and inflate its radius from zero to infinity. Inside the charged cylinder Gauss's law is as follows:

$$2\pi r\, l\, E = \frac{r^2\pi\, l\, \rho}{\varepsilon_0} \tag{1.61}$$

On the left-hand side is the flux; on the right-hand side is the enclosed volume of the cylinder multiplied by the charge density and divided by $\varepsilon_0$. Many terms cancel out.

$$E(r) = \frac{\rho}{2\varepsilon_0}\, r \tag{1.62}$$

The result is not surprising. As the radius increases, the enclosed charge grows with the second power while the surface area grows linearly, so their ratio is linear. Outside the charged cylinder the enclosed charge no longer grows; only the surface continues to grow linearly.

$$2\pi r\, l\, E = \frac{R^2\pi\, l\, \rho}{\varepsilon_0} \tag{1.63}$$

$$E(r) = \frac{\rho R^2}{2\varepsilon_0}\,\frac{1}{r} \tag{1.64}$$
The two equations above show that the electric field is a continuous function, since substituting $r = R$ on the surface of the cylinder produces the same result in both.

On the surface of the cylinder the numerical value of the electric field can readily be calculated:

$$E(R) = \frac{\rho}{2\varepsilon_0} R = \frac{10^{-6}}{2 \cdot 8.86 \cdot 10^{-12}} \cdot 0.1 = 5643\ \frac{\mathrm{V}}{\mathrm{m}} \tag{1.65}$$
The potential function can be determined by integrating the electric field. First the internal region is integrated. The reference point of the potential is the center.

$$U_{in}(r) = -\int_{0}^{r} E(r')\,dr' = -\int_{0}^{r} \frac{\rho}{2\varepsilon_0}\, r'\,dr' = -\frac{\rho}{2\varepsilon_0}\left[\frac{r'^2}{2}\right]_{r'=0}^{r'=r} = -\frac{\rho}{2\varepsilon_0}\,\frac{r^2}{2} \tag{1.66}$$

$$U_{in}(r) = -\frac{\rho}{4\varepsilon_0}\, r^2 \tag{1.67}$$
The surface potential of the cylinder is the above function with $r = R$ substituted:

$$U_{in}(R) = -\frac{\rho R^2}{4\varepsilon_0} = -\frac{10^{-6} \cdot 10^{-2}}{4 \cdot 8.86 \cdot 10^{-12}} = -282\ \mathrm{V} \tag{1.68}$$
Remember that this value must be added to the integral calculated next.

$$U_{out}(r) = U_{in}(R) + U_R^r = U_{in}(R) + \left(-\int_{R}^{r} E(r')\,dr'\right) \tag{1.69}$$
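For simplicity, only the integral in the parentheses is transformed first:

$$U_R^r = -\int_{R}^{r} \frac{\rho R^2}{2\varepsilon_0}\,\frac{1}{r'}\,dr' = -\frac{\rho R^2}{2\varepsilon_0}\Big[\ln r'\Big]_{r'=R}^{r'=r} = -\frac{\rho R^2}{2\varepsilon_0}\ln\frac{r}{R} \tag{1.70}$$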
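Altogether:

$$U_{out}(r) = U_{in}(R) + U_R^r = -\frac{\rho R^2}{4\varepsilon_0} - \frac{\rho R^2}{2\varepsilon_0}\ln\frac{r}{R} = -\frac{\rho R^2}{4\varepsilon_0}\left(1 + 2\ln\frac{r}{R}\right) \tag{1.71}$$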
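The closed-form potential of the charged cylinder can be compared with a direct numerical integration of the field. The sketch below assumes the convention that the potential is minus the line integral of the field starting from the reference point on the axis, so the potential is negative for positive charge density; the function names and the trapezoidal rule are illustrative choices, not part of the original derivation.

```python
import math

EPS0 = 8.86e-12   # permittivity of vacuum as used in the text, As/(Vm)
RHO = 1e-6        # space charge density, As/m^3
R = 0.1           # cylinder radius, m

def e_field(r):
    """Radial field of the uniformly charged cylinder, Eqs. (1.62) and (1.64)."""
    if r <= R:
        return RHO * r / (2 * EPS0)
    return RHO * R**2 / (2 * EPS0 * r)

def potential_closed(r):
    """Closed-form potential with the reference point on the axis."""
    if r <= R:
        return -RHO * r**2 / (4 * EPS0)
    return -RHO * R**2 / (4 * EPS0) * (1 + 2 * math.log(r / R))

def potential_numeric(r, steps=20000):
    """U(r) = -integral of E from 0 to r, evaluated with the trapezoidal rule."""
    h = r / steps
    total = 0.5 * (e_field(0.0) + e_field(r))
    for k in range(1, steps):
        total += e_field(k * h)
    return -total * h

print(potential_closed(0.3), potential_numeric(0.3))
```

The two values agree closely both inside and outside the cylinder, which checks the logarithmic term of the external potential.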
Figure 1.5: Cylinder with uniform charge density. Electric field vs. radial position; potential vs. radial position.
1.11 Infinite parallel plate with uniform surface charge density

An infinite metal plate carries a surface charge density $\sigma = 10^{-8}$ As/m$^2$. Find the electric field and the potential as functions of the distance from the plate and sketch the result. The reference point of the potential should be the plate itself. Calculate the value of the electric field on the surface of the metal plate.

$$\oint_S \mathbf{E}(\mathbf{r})\,d\mathbf{A} = \frac{Q}{\varepsilon_0} \tag{1.73}$$
Gauss's law is used to solve the problem. Pick a virtual drum with base area $A$. Position the drum with its rotational axis normal to the charged plate. The charged plate should cut the drum into two symmetrical parts.

$$E \cdot 2A = \frac{\sigma A}{\varepsilon_0} \tag{1.74}$$
The absolute value of the electric field can be expressed:

$$E = \frac{\sigma}{2\varepsilon_0} = \frac{10^{-8}}{2 \cdot 8.86 \cdot 10^{-12}} = 564\ \frac{\mathrm{V}}{\mathrm{m}} \tag{1.75}$$
The result shows that the electric field is constant in each half space. The direction of the electric field is opposite in the two half spaces. In contrast to the spherical and cylindrical structures, where the radial distance is the position parameter, here a reference direction line $x$ normal to the plate will be used. The potential function can be determined by integrating the electric field in the positive half space:

$$U(x) = -\int_{0}^{x} E(x')\,dx' = -\int_{0}^{x} \frac{\sigma}{2\varepsilon_0}\,dx' = -\frac{\sigma}{2\varepsilon_0}\Big[x'\Big]_{x'=0}^{x'=x} = -\frac{\sigma}{2\varepsilon_0}\,x \tag{1.76}$$
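In the negative half space the field direction is reversed, so the sign in front of $x$ in the potential formula is reversed as well; the potential depends only on the distance $|x|$ from the plate.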
Figure 1.6: Infinite parallel plate with uniform surface charge density. Electric field vs. position; potential vs. position.

Note that the reference point of the potential cannot be placed at infinity, because the infinite plate is not a physically real object.
1.12 Capacitors
Capacitors consist of two plates that store charge. The overall contained charge is zero since the charges on the plates are opposite; therefore the electric field is confined to the inner volume of the capacitor. The capacitance is the ratio of the charge to the voltage generated between the plates.

$$C = \frac{Q}{U}$$

Three different geometries will be treated below.

Parallel plate capacitor

The parallel plate capacitor is made of two parallel metal plates facing each other with active surface area $A$. The distance between the plates and the charge are denoted by $d$ and $Q$, respectively.
Figure 1.7: Parallel plate capacitor

There is a homogeneous electric field between the plates, while outside the capacitor there is no electric field. Use Gauss's law for a drum-like surface which surrounds one of the plates.

$$EA = \frac{Q}{\varepsilon_0} \qquad E = \frac{Q}{A\varepsilon_0} \tag{1.78}$$
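Finding the voltage between the plates needs no integration because the field is homogeneous.

$$U = dE \qquad U = \frac{dQ}{A\varepsilon_0} \tag{1.79}$$

The capacitance is therefore

$$C = \frac{Q}{U} = \varepsilon_0\,\frac{A}{d}$$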
1.12.1 Cylindrical capacitor
The cylindrical capacitor is made of two coaxial metal cylinders. The inner and the outer radii as well as the length are denoted $R_1$, $R_2$ and $l$, respectively. The coaxial cable is the only cylindrical capacitor used in practice. Let us use Gauss's law. A coaxial cylinder should be inflated from $R_1$ to $R_2$.

$$\oint_S \mathbf{E}(\mathbf{r})\,d\mathbf{A} = \frac{Q}{\varepsilon_0} \tag{1.81}$$

$$E \cdot 2\pi r\, l = \frac{Q}{\varepsilon_0} \tag{1.82}$$

$$E = \frac{Q}{2\pi\varepsilon_0 l}\,\frac{1}{r} \tag{1.83}$$
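$$U = \int_{R_1}^{R_2} \frac{Q}{2\pi\varepsilon_0 l}\,\frac{1}{r}\,dr = \frac{Q}{2\pi\varepsilon_0 l}\int_{R_1}^{R_2} \frac{1}{r}\,dr = \frac{Q}{2\pi\varepsilon_0 l}\ln\frac{R_2}{R_1} \tag{1.84}$$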
The capacitance follows as $C = Q/U = 2\pi\varepsilon_0 l / \ln(R_2/R_1)$, which shows the obvious fact that the capacitance is proportional to the length of the structure. Because of this, the capacitance of one meter of coaxial cable is mostly used. It is denoted $c$ and measured in F/m. Most coaxial cables have values of some tens of pF/m.

$$c = \frac{Q}{U\,l} = \frac{2\pi\varepsilon_0}{\ln(R_2/R_1)} \tag{1.86}$$
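As a quick plausibility check of the order of magnitude, the capacitance per meter can be evaluated for a sample geometry. The radii below are hypothetical values chosen only for illustration, and the sketch uses the vacuum formula of this section, without any insulating filler.

```python
import math

EPS0 = 8.86e-12   # permittivity of vacuum as used in the text, As/(Vm)

def coax_capacitance_per_meter(r_inner, r_outer):
    """Capacitance per unit length of a coaxial structure in vacuum, Eq. (1.86)."""
    return 2 * math.pi * EPS0 / math.log(r_outer / r_inner)

# Hypothetical cable: inner radius 0.5 mm, outer radius 1.75 mm
c = coax_capacitance_per_meter(0.5e-3, 1.75e-3)
print(c * 1e12, "pF/m")
```

Only the ratio of the radii enters the result, so scaling both radii by the same factor leaves the capacitance per meter unchanged.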
1.12.2 Spherical capacitor
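The spherical capacitor is made of two concentric metal spheres. The inner and the outer radii as well as the charge are denoted $R_1$, $R_2$ and $Q$, respectively. Let us use Gauss's law. A concentric sphere should be inflated from $R_1$ to $R_2$.

$$\oint_S \mathbf{E}(\mathbf{r})\,d\mathbf{A} = \frac{Q}{\varepsilon_0} \tag{1.87}$$

$$E \cdot 4\pi r^2 = \frac{Q}{\varepsilon_0} \tag{1.88}$$

$$E = \frac{Q}{4\pi\varepsilon_0}\,\frac{1}{r^2} \tag{1.89}$$

$$U = \int_{R_1}^{R_2} \frac{Q}{4\pi\varepsilon_0}\,\frac{1}{r^2}\,dr = \frac{Q}{4\pi\varepsilon_0}\left[\frac{1}{r}\right]_{r=R_2}^{r=R_1} = \frac{Q}{4\pi\varepsilon_0}\left(\frac{1}{R_1} - \frac{1}{R_2}\right) \tag{1.90}$$

$$C = \frac{Q}{U} = \frac{4\pi\varepsilon_0}{\dfrac{1}{R_1} - \dfrac{1}{R_2}} \tag{1.91}$$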
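Two limiting cases give a useful sanity check of the spherical capacitor formula. The short derivation below is an added illustration; the gap width $d = R_2 - R_1$ and the area $A = 4\pi R_1^2$ are notations introduced only for this check.

```latex
C = \frac{4\pi\varepsilon_0}{\frac{1}{R_1}-\frac{1}{R_2}}
  = 4\pi\varepsilon_0\,\frac{R_1 R_2}{R_2 - R_1}

% Outer sphere removed to infinity: capacitance of an isolated sphere
\lim_{R_2 \to \infty} C = 4\pi\varepsilon_0 R_1

% Thin gap, d = R_2 - R_1 \ll R_1: the parallel plate result is recovered
C \approx 4\pi\varepsilon_0\,\frac{R_1^2}{d} = \varepsilon_0\,\frac{A}{d},
\qquad A = 4\pi R_1^2
```

The second limit agrees with the parallel plate capacitor, as expected when the curvature of the plates is negligible compared with their separation.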
1.13 Principle of superposition
Gauss's law can only be used effectively in the three symmetry classes mentioned earlier. If the charge arrangement does not belong to any of those classes, the principle of superposition is the only choice. In this case the charge arrangement is virtually broken into small pieces, and the electric fields of these pieces are superimposed as those of point charges.

Find the electric field of a charged filament of finite length in its equatorial plane as a function of the distance from the filament. The linear charge density is denoted $\lambda$.
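Since the filament is not infinitely long, Gauss's law cannot be used effectively. The charged filament is divided into infinitesimal pieces and the electric fields of these pieces are added together. For symmetry reasons only the normal components of the electric field are integrated, since the parallel components cancel out in pairs. The mathematical deduction of the final formula follows below without detailed comments on the transformations. For the definition of the notations refer to the figure below. Here $R$ is the distance from the filament, $r$ is the distance from the charge element, and $\alpha$ is the angle under which the element is seen relative to the normal direction. The infinitesimal contribution to the electric field is calculated as that of a point charge.

$$dE = \frac{1}{4\pi\varepsilon_0}\,\frac{dQ}{r^2} \qquad dQ = \frac{\lambda\, r\, d\alpha}{\cos\alpha} \qquad r = \frac{R}{\cos\alpha} \tag{1.92}$$

The infinitesimal charge is contained within the infinitesimal angle.

$$dE = \frac{1}{4\pi\varepsilon_0}\,\frac{\lambda\, r\, d\alpha}{\cos\alpha}\,\frac{1}{r^2} = \frac{\lambda}{4\pi\varepsilon_0}\,\frac{1}{r\cos\alpha}\,d\alpha = \frac{\lambda}{4\pi\varepsilon_0}\,\frac{1}{R}\,d\alpha \tag{1.93}$$

The electric field of the point charge is projected onto the perpendicular direction. The parallel components cancel out in symmetric pairs.

$$dE_\perp = dE\,\cos\alpha \tag{1.94}$$

$$dE_\perp = \frac{\lambda}{4\pi\varepsilon_0}\,\frac{1}{R}\,\cos\alpha\,d\alpha \tag{1.95}$$

$$E = \int dE_\perp = \int_{-\alpha}^{\alpha} \frac{\lambda}{4\pi\varepsilon_0}\,\frac{1}{R}\,\cos\alpha'\,d\alpha' = \frac{\lambda}{4\pi\varepsilon_0 R}\int_{-\alpha}^{\alpha} \cos\alpha'\,d\alpha' = \frac{\lambda}{4\pi\varepsilon_0 R}\,2\sin\alpha$$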
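Figure 1.10: Principle of superposition

The result of the superposition is the formula below, which could not have been obtained with Gauss's law. Here $\alpha$ is the half-angle under which the filament is seen from the observation point.

$$E = \frac{\lambda}{2\pi R\varepsilon_0}\,\sin\alpha \tag{1.97}$$

If $\alpha$ approaches ninety degrees, the filament tends to infinite length, in which case using Gauss's law is an option.

$$E_\infty = \lim_{\alpha \to \pi/2} E = \frac{\lambda}{2\pi R\varepsilon_0} \tag{1.98}$$

Using Gauss's law, the above result can be reached far more easily for the infinitely long filament.

$$E_\infty \cdot 2\pi R\, l = \frac{\lambda\, l}{\varepsilon_0} \tag{1.99}$$

$$E_\infty = \frac{\lambda}{2\pi R\varepsilon_0} \tag{1.100}$$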
The results match perfectly. The point, however, is that the superposition principle can be used in full generality, but it is far more laborious than using Gauss's law where the latter is possible.
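The superposition itself can also be carried out numerically: the filament is cut into many short pieces, each treated as a point charge, and the perpendicular field components are summed. The sketch below is an illustration under assumed names and sample values (the linear charge density is called `lam`, and the filament length and distance are hypothetical); it compares the sum with the closed-form result of this section.

```python
import math

EPS0 = 8.86e-12   # permittivity of vacuum as used in the text, As/(Vm)

def field_by_superposition(lam, half_length, dist, pieces=20000):
    """Sum the perpendicular field components of point-like filament pieces."""
    dz = 2 * half_length / pieces
    total = 0.0
    for k in range(pieces):
        z = -half_length + (k + 0.5) * dz            # midpoint of the k-th piece
        r2 = dist**2 + z**2                          # squared distance to the piece
        d_e = lam * dz / (4 * math.pi * EPS0 * r2)   # point-charge field of the piece
        total += d_e * dist / math.sqrt(r2)          # projection: cos(alpha) = dist / r
    return total

def field_closed_form(lam, half_length, dist):
    """E = lam * sin(alpha) / (2 pi R eps0), with alpha the half-angle of the filament."""
    sin_alpha = half_length / math.hypot(half_length, dist)
    return lam * sin_alpha / (2 * math.pi * dist * EPS0)

# Hypothetical values: 1e-8 As/m, filament of 1 m total length, point at 0.1 m
print(field_by_superposition(1e-8, 0.5, 0.1), field_closed_form(1e-8, 0.5, 0.1))
```

Increasing `half_length` while keeping `dist` fixed drives both values toward the infinite-filament result obtained from Gauss's law.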
2.1 The electric dipole

Consider a pair of opposite point charges $(+q, -q)$. Draw the separation vector ($\mathbf{s}$) from the negative to the positive point charge. The following product defines the electric dipole moment:

$$\mathbf{p} = q\,\mathbf{s} \quad [\mathrm{Asm}] \tag{2.1}$$
In order to obtain a point-like dipole, the definition is completed with a limit transition. Accordingly, the absolute value of the separation vector shrinks to zero while the charge tends to infinity in such a way that the product remains a constant vector. The point-like dipole is a useful model when the distance between the charges is far smaller than the surrounding geometry, for example if a dipole molecule is located in the proximity of centimeter-size electrodes. A homogeneous electric field exerts a force couple on the dipole. The torque ($\mathbf{M}$) generated turns the dipole parallel to the electric field.

$$\mathbf{M} = \mathbf{p} \times \mathbf{E} \quad \left[\mathrm{Asm}\cdot\frac{\mathrm{V}}{\mathrm{m}} = \mathrm{Nm}\right] \tag{2.2}$$
The dipole moment turns spontaneously into the direction of the electric field and stays there. In this position the dipole stores the least potential energy. Obviously the most potential energy is stored in the opposite position. Let us find the work needed to turn the dipole from the lowest to the highest energy position, where $\varphi$ is the angle between $\mathbf{p}$ and $\mathbf{E}$.

$$W = \int_{0}^{\pi} M\,d\varphi = \int_{0}^{\pi} pE\sin\varphi\,d\varphi = pE\Big[-\cos\varphi\Big]_{0}^{\pi} = 2pE \tag{2.3}$$

According to this result the potential energy of the dipole is as follows:

$$E_{pot} = -\mathbf{p}\cdot\mathbf{E} \tag{2.4}$$
This formula gives the lowest energy at the spontaneous parallel position and the highest at the anti-parallel position. The zero of the potential energy is at ninety degrees. The difference between the highest and the lowest energy is just the work needed to turn the dipole around.
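The work integral can be checked numerically by summing the torque over small angular steps. The sketch below is illustrative only: the dipole moment and field strength are hypothetical sample values, and the midpoint rule is an arbitrary choice of integration scheme.

```python
import math

def work_to_flip(p, e_field, steps=10000):
    """Integrate the torque M = p E sin(phi) from phi = 0 to pi (midpoint rule)."""
    dphi = math.pi / steps
    return sum(p * e_field * math.sin((k + 0.5) * dphi) * dphi for k in range(steps))

p = 6.2e-30   # hypothetical dipole moment in Asm (order of magnitude of a polar molecule)
E = 1.0e5     # hypothetical field strength in V/m
W = work_to_flip(p, E)
print(W, 2 * p * E)
```

The numerical sum reproduces the difference between the anti-parallel and the parallel potential energy, $pE - (-pE) = 2pE$.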
2.2 Polarization
Take a plate capacitor and fill its volume with a dielectric material. Experiment shows that the capacitance increases relative to the empty case. The explanation is the polarization of the dielectric material.

Figure 2.2: Plate capacitor with dipole chain

Due to the electric field of the metal plates, the atomic dipoles are arranged in chains as the figure above shows. Inside the dielectric material the electric effects of the dipoles cancel out, since any macroscopic volume contains equal numbers of positive and negative charges. The exceptions are the two sides of the dielectric material, where the uncompensated polarization surface charge densities reside. These uncompensated polarization surface charges are opposite in polarity to the adjacent metal plates. In this way the effective total charge is reduced, thus the voltage of the capacitor is diminished and ultimately the capacitance is increased. Assume that the insulating material contains $n$ dipoles per unit volume (1/m$^3$). The surface area and the separation of the plates are denoted $A$ and $d$, respectively. The total number of dipoles ($N$) is as follows:

$$N = A\,d\,n \tag{2.5}$$
The dipoles are located in chains between the metal plates. The number of dipoles in such a chain is the ratio of the distance between the plates ($d$) and the separation of the dipole charges ($s$). The total number of dipoles ($N$) can be expressed by multiplying the number of dipoles in a chain by the number of chains ($c$) present in the material:

$$N = \frac{d}{s}\,c \tag{2.6}$$

Let us combine these latter two equations:

$$A\,d\,n = \frac{d}{s}\,c \qquad A\,n\,s = c \tag{2.7}$$

The separation ($s$) can be expressed as the ratio of the dipole moment ($p$) and the charge of the dipole ($q$). (In the present discussion the absolute values of the quantities are denoted without vector notation.)

$$s = \frac{p}{q} \tag{2.8}$$

The total polarization surface charge is the product of the number of chains and the uncompensated opposite charge at the end of each dipole chain.

$$Q_p = c\,(-q) = -A\,n\,p \tag{2.10}$$

Finally the polarization surface charge density ($\sigma_p$) needs to be expressed:

$$\sigma_p = \frac{Q_p}{A} = -n\,p = -P \quad \left[\frac{\mathrm{As}}{\mathrm{m}^2}\right] \tag{2.11}$$
Here we introduced the polarization vector ($\mathbf{P}$). The sources of this vector are the opposite of the polarization charges. The negative sign comes from the definition of the electric dipole, which points from the minus to the plus charge, in contrast to the direction of the electric field. (Without vector notation the absolute value is meant.) An additional result can also be concluded: the density of dipoles gives rise to the polarization. The formula is valid in three dimensions too.

$$P = n\,p \qquad \mathbf{P} = n\,\mathbf{p} \tag{2.12}$$
2.3 Dielectric displacement
The metal plates of the capacitor contain what are called the free charges. The free charges are mobile and can be conducted away by means of a wire. In contrast, the polarization charges are immobile. In chapter 1 the homogeneous electric field in a plate capacitor was expressed, provided the free surface charge densities ($+\sigma_{free}$, $-\sigma_{free}$) are located on the metal plates.

$$E_{free} = \frac{\sigma_{free}}{\varepsilon_0} \qquad \varepsilon_0 E_{free} = \sigma_{free} \tag{2.13}$$

Here we introduce the vector of the dielectric displacement ($\mathbf{D}$). The sources of this vector are the free mobile charges. (Without vector notation the absolute value is meant.)

$$\varepsilon_0 E_{free} = D = \sigma_{free} \quad \left[\frac{\mathrm{As}}{\mathrm{m}^2}\right] \tag{2.14}$$

The polarization surface charges ($+\sigma_p$, $-\sigma_p$) act in the same way but in the opposite direction:

$$E_p = \frac{\sigma_p}{\varepsilon_0} \qquad \varepsilon_0 E_p = \sigma_p \tag{2.15}$$

$$\varepsilon_0 E_p = -P = \sigma_p \quad \left[\frac{\mathrm{As}}{\mathrm{m}^2}\right] \tag{2.16}$$

The total surface charge density is the sum of the free and the polarization charge densities:

$$\sigma_{tot} = \sigma_{free} + \sigma_p \tag{2.17}$$

The total charge density is the source of the resulting electric field in the capacitor:

$$\varepsilon_0 E = D - P \tag{2.18}$$

$$D = \varepsilon_0 E + P \tag{2.19}$$

The last formula has been deduced for the one-dimensional case, where the quantities appear as real numbers. However, the result is true in full generality in three dimensions with vectors as well.

$$\mathbf{D} = \varepsilon_0 \mathbf{E} + \mathbf{P} \tag{2.20}$$
Note the important fact that the electric field, and so the intensity of the forces, is always reduced in the presence of dielectric material.
2.4 Electric permittivity (dielectric constant)

Experiments show that the polarization of an isotropic material is a monotonic function of the external electric field. At external fields of higher intensity the insulating material gradually saturates. At relatively low levels the function can be considered linear. Our discussion is confined to the linear range. In this case the polarization is proportional to the external electric field. In order to turn the proportionality into an equation a coefficient ($\chi\varepsilon_0$) is introduced.

$$\mathbf{P} = \chi\,\varepsilon_0\,\mathbf{E} \tag{2.21}$$

The coefficient is the product of the permittivity of vacuum ($\varepsilon_0$) and the electric susceptibility ($\chi$). Substitute this equation into the former expression of the $\mathbf{D}$ vector:

$$\mathbf{D} = \varepsilon_0\mathbf{E} + \chi\varepsilon_0\mathbf{E} = \varepsilon_0(1 + \chi)\mathbf{E} = \varepsilon_0\varepsilon_r\mathbf{E} \tag{2.22}$$

Here the relative permittivity ($\varepsilon_r$) has been introduced:

$$1 + \chi = \varepsilon_r \tag{2.23}$$

Typical values of the relative permittivity of common insulating materials are up to five or so; much higher values occur only in special materials such as water or ferroelectric ceramics. Finally the result to be remembered is as follows:

$$\mathbf{D} = \varepsilon_0\varepsilon_r\mathbf{E} \quad \left[\frac{\mathrm{As}}{\mathrm{m}^2}\right] \tag{2.24}$$
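2.5 Gauss's law and the dielectric material

In this section vector calculus will be used at a somewhat higher level. The divergence operation (div) generates a scalar field which represents the sources of a vector field.

$$\mathbf{V}(\mathbf{r}) = V_x(x, y, z)\,\mathbf{i} + V_y(x, y, z)\,\mathbf{j} + V_z(x, y, z)\,\mathbf{k} \tag{2.25}$$

$$\operatorname{div}\mathbf{V}(\mathbf{r}) = \frac{\partial V_x}{\partial x} + \frac{\partial V_y}{\partial y} + \frac{\partial V_z}{\partial z} \tag{2.26}$$

The Gauss-Ostrogradsky theorem relates the flux through a closed surface to the volume integral of the divergence as follows:

$$\oint_S \mathbf{V}(\mathbf{r})\,d\mathbf{A} = \int_V (\operatorname{div}\mathbf{V})\,dV \tag{2.27}$$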
Let us take the divergence of the equation discussed earlier in this chapter:

$$\mathbf{D} = \varepsilon_0\mathbf{E} + \mathbf{P} \tag{2.28}$$

$$\operatorname{div}\mathbf{D} = \operatorname{div}(\varepsilon_0\mathbf{E}) + \operatorname{div}\mathbf{P} \tag{2.29}$$

The sources, or in other words the divergences, are in general the corresponding volume charge densities ($\rho_{free}$, $\rho_{tot}$, $\rho_p$). Earlier in this chapter the surface charge densities were discussed in detail for the case of the plate capacitor. Accordingly the following relations are plausible:

$$\operatorname{div}\mathbf{D} = \rho_{free} \qquad \operatorname{div}(\varepsilon_0\mathbf{E}) = \rho_{tot} \qquad \operatorname{div}\mathbf{P} = -\rho_p \tag{2.30}$$

The Gauss-Ostrogradsky theorem generates the integral form of the relations above:

$$\oint_S \mathbf{D}\,d\mathbf{A} = Q_{free} \qquad \oint_S (\varepsilon_0\mathbf{E})\,d\mathbf{A} = Q_{tot} \qquad \oint_S \mathbf{P}\,d\mathbf{A} = -Q_p \tag{2.31}$$

The left-hand integral is the well-known form of Gauss's law with the $\mathbf{D}$ vector. It expresses that the flux of the $\mathbf{D}$ vector through a closed surface ($S$) equals the amount of free charge contained. The integral in the middle expresses that the flux of the $\varepsilon_0\mathbf{E}$ vector equals the total amount of all charges contained. Finally the right-hand integral states that the flux of the polarization vector $\mathbf{P}$ equals the opposite of the polarization charges contained by the surface $S$.
2.6 Inhomogeneous dielectric materials

Consider two different dielectric materials with plane surfaces. The plane surfaces are joined, thus creating an interface between the insulators. This structure is the subject of the investigation. First the $\mathbf{D}$ field is studied. The interface is enclosed by a symmetrical disc-like drum with base area $A$. The upper and lower surface vectors are $\mathbf{A}_1$ and $\mathbf{A}_2$, respectively.

$$\mathbf{A}_1 = -\mathbf{A}_2 \qquad |\mathbf{A}_1| = |\mathbf{A}_2| = A \tag{2.32}$$

The volume does not contain free charges, therefore the flux of the $\mathbf{D}$ vector is zero.

$$\oint_S \mathbf{D}\,d\mathbf{A} = \mathbf{D}_1\mathbf{A}_1 + \mathbf{D}_2\mathbf{A}_2 = 0 \tag{2.33}$$

$$\mathbf{D}_1\mathbf{A}_2 = \mathbf{D}_2\mathbf{A}_2 \tag{2.34}$$
Figure 2.4: D field at the interface of different dielectric materials

The dot product contains the projection of the $\mathbf{D}$ vectors onto the direction of the $\mathbf{A}_2$ vector, which is the normal direction of the surface. The subscript $n$ denotes the absolute value of the normal component.

$$D_{1n}A_2 = D_{2n}A_2 \tag{2.35}$$

Now that we are dealing with real numbers, the surface area cancels out readily.

$$D_{1n} = D_{2n} \tag{2.36}$$

According to this result the normal component of the $\mathbf{D}$ vector is continuous at the interface of dielectric materials. Secondly, the electric field $\mathbf{E}$ is the subject of analysis.
The interface is surrounded by a very narrow rectangle-like loop ($g$) with sections parallel and normal to the surface. The parallel sections of the loop are the vectors $\mathbf{s}$ and $-\mathbf{s}$. The normal sections are ignored due to their infinitesimal size. The closed loop integral of $\mathbf{E}$ in a static electric field is zero.

$$\oint_g \mathbf{E}\,d\mathbf{r} = \mathbf{s}\mathbf{E}_1 + (-\mathbf{s})\mathbf{E}_2 = 0 \tag{2.37}$$

$$\mathbf{s}\mathbf{E}_1 = \mathbf{s}\mathbf{E}_2 \tag{2.38}$$

The dot product contains the projection of the $\mathbf{E}$ vectors onto the direction of the $\mathbf{s}$ vector, which is the tangential direction of the surface. The subscript $t$ denotes the absolute value of the tangential component.

$$sE_{1t} = sE_{2t} \tag{2.39}$$

Now that we are dealing with real numbers, the length of the tangential section cancels out readily.

$$E_{1t} = E_{2t} \tag{2.40}$$

According to this result the tangential component of the $\mathbf{E}$ vector is continuous at the interface of dielectric materials.
2.7 Demonstration examples

2.7.1
A metal sphere with radius $R_1 = 10$ cm carries free charge $Q_{free} = 10^{-8}$ As. The metal sphere is surrounded by an insulating layer ($\varepsilon_r = 3$) up to the radius $R_2 = 15$ cm. Find and sketch the radial dependence of the $\mathbf{D}$, $\mathbf{E}$ and $\mathbf{P}$ vectors. Determine the numerical peak values at the break points and find the amount of the polarization charge. The first quantity to deal with is the $\mathbf{D}$ vector, because its normal component is continuous at the interface of dielectric materials. Let us use Gauss's law.

$$\oint_S \mathbf{D}\,d\mathbf{A} = Q_{free} \tag{2.41}$$

Figure 2.6: Metal sphere surrounded by insulating layer

Inside the metal sphere all the quantities are zero; only the region outside the metal sphere is of interest.

$$4\pi r^2 D = Q_{free} \tag{2.42}$$

$$D = \frac{Q_{free}}{4\pi}\,\frac{1}{r^2} \tag{2.43}$$

The peak value at the break point results once $r = R_1$ is substituted.

$$D(R_1) = \frac{Q_{free}}{4\pi}\,\frac{1}{R_1^2} = \frac{10^{-8}}{4\pi}\cdot 100 = 7.96\cdot 10^{-8}\ \frac{\mathrm{As}}{\mathrm{m}^2} \tag{2.44}$$
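The $\varepsilon_0 E$ field is identical to the $D$ function outside the insulator. In the insulator, however, the $\varepsilon_0 E$ function is reduced to one third, according to the value $\varepsilon_r = 3$.

Figure 2.8: The absolute value of $\varepsilon_0 E$ vs. radial position

The $\varepsilon_0 E$ functions in the insulator and outside the structure are as follows:

$$\varepsilon_0 E_{in} = \frac{Q_{free}}{4\pi\varepsilon_r}\,\frac{1}{r^2} \qquad \varepsilon_0 E_{out} = \frac{Q_{free}}{4\pi}\,\frac{1}{r^2} \tag{2.45}$$

The break point peak values of the $\varepsilon_0 E$ function are as follows:

$$\varepsilon_0 E_{in}(R_1) = \frac{Q_{free}}{4\pi\varepsilon_r}\,\frac{1}{R_1^2} = \frac{10^{-8}}{4\pi\cdot 3}\,\frac{1}{0.1^2} = 2.65\cdot 10^{-8}\ \frac{\mathrm{As}}{\mathrm{m}^2} \tag{2.46}$$

$$\varepsilon_0 E_{out}(R_2) = \frac{Q_{free}}{4\pi}\,\frac{1}{R_2^2} = \frac{10^{-8}}{4\pi}\,\frac{1}{0.15^2} = 3.54\cdot 10^{-8}\ \frac{\mathrm{As}}{\mathrm{m}^2} \tag{2.47}$$

The corresponding electric fields are:

$$E_{in}(R_1) = \frac{2.65\cdot 10^{-8}}{8.86\cdot 10^{-12}} \approx 3000\ \frac{\mathrm{V}}{\mathrm{m}} \qquad E_{out}(R_2) = \frac{3.54\cdot 10^{-8}}{8.86\cdot 10^{-12}} \approx 4000\ \frac{\mathrm{V}}{\mathrm{m}} \tag{2.48}$$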
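The radial function of the $\mathbf{P}$ vector is zero except in the insulating material. In the insulating material it is as follows:

$$D = \varepsilon_0 E + P \tag{2.49}$$

$$P = D - \varepsilon_0 E_{in} \tag{2.50}$$

$$P = \frac{Q_{free}}{4\pi}\,\frac{1}{r^2} - \frac{Q_{free}}{4\pi\varepsilon_r}\,\frac{1}{r^2} = \left(1 - \frac{1}{\varepsilon_r}\right)\frac{Q_{free}}{4\pi}\,\frac{1}{r^2} \tag{2.51}$$

Figure 2.9: The absolute value of P vs. radial position

Let us determine the peak values at the break points:

$$P(R_1) = \left(1 - \frac{1}{\varepsilon_r}\right)\frac{Q_{free}}{4\pi}\,\frac{1}{R_1^2} = \left(1 - \frac{1}{3}\right)\frac{10^{-8}}{4\pi}\,\frac{1}{0.1^2} = 5.31\cdot 10^{-8}\ \frac{\mathrm{As}}{\mathrm{m}^2} \tag{2.52}$$

$$P(R_2) = \left(1 - \frac{1}{\varepsilon_r}\right)\frac{Q_{free}}{4\pi}\,\frac{1}{R_2^2} = \left(1 - \frac{1}{3}\right)\frac{10^{-8}}{4\pi}\,\frac{1}{0.15^2} = 2.36\cdot 10^{-8}\ \frac{\mathrm{As}}{\mathrm{m}^2} \tag{2.53}$$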
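The polarization charge follows from the flux of the $\mathbf{P}$ vector through a sphere inside the insulator:

$$\oint_S \mathbf{P}\,d\mathbf{A} = -Q_p \tag{2.54}$$

$$4\pi r^2\left(1 - \frac{1}{\varepsilon_r}\right)\frac{Q_{free}}{4\pi}\,\frac{1}{r^2} = -Q_p \tag{2.55}$$

$$Q_p = -\left(1 - \frac{1}{\varepsilon_r}\right)Q_{free} \tag{2.56}$$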
2.7.2
Study the results of the previous demonstration example in the hypothetical case (Case 1) in which the relative permittivity tends to infinity. Compare the results with the case (Case 2) in which the dielectric material is replaced with metal. It is interesting to observe that the $\mathbf{E}$ field is identical in both cases. The sources of the electric field are the total charges, so both the free and the polarization charges count. The $\mathbf{D}$ field is different: in Case 1 the function did not change, but in Case 2 it vanished between the radii. This happened because the sources of the $\mathbf{D}$ field are solely the free charges, so in Case 1 it did not change, while in Case 2 it did change due to the free charges induced on the metal. In Case 1 the $\mathbf{P}$ and the $\mathbf{D}$ fields compensate each other, so the $\mathbf{E}$ field is reduced to zero between the radii. In Case 2 the $\mathbf{P}$ vector is obviously zero in the absence of dielectric material.
2.8 Energy relations
Any electrostatic charge arrangement represents potential energy. This energy equals the amount of work needed to create the arrangement.
2.8.1
Consider a capacitor that is initially uncharged. Carry an infinitesimal amount of charge $dQ$ from one plate to the other. As a result, a voltage ($dQ/C$) appears between the plates. The next packet of charge $dQ$ must be carried against the electric field generated by the previous packets. In this way the voltage on the capacitor and the infinitesimal amounts of work increase linearly. The triangle under the graph represents the work done. The total amount of work can be calculated by integrating these infinitesimal contributions.

$$E_{pot} = W = \int_{0}^{Q} U(Q')\,dQ' = \int_{0}^{Q} \frac{Q'}{C}\,dQ' = \frac{1}{C}\left[\frac{Q'^2}{2}\right]_{Q'=0}^{Q'=Q} = \frac{1}{2}\,\frac{Q^2}{C} \tag{2.57}$$
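Comparison of the two cases of example 2.7.2 (Figs. 2.10 to 2.12 show the $D$ field and the $\varepsilon_0 E$ field vs. radial position):

Case 1. Relative permittivity tends to infinity:
$D = \frac{Q_{free}}{4\pi}\frac{1}{r^2}$ for $R_1 \le r$; $\varepsilon_0 E_{out} = \frac{Q_{free}}{4\pi}\frac{1}{r^2}$ for $R_2 \le r$; $P = \frac{Q_{free}}{4\pi}\frac{1}{r^2}$ for $R_1 \le r \le R_2$.

Case 2. Insulating layer is replaced with metal:
$D = \frac{Q_{free}}{4\pi}\frac{1}{r^2}$ for $R_2 \le r$; $\varepsilon_0 E_{out} = \frac{Q_{free}}{4\pi}\frac{1}{r^2}$ for $R_2 \le r$.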
Figure 2.16: Voltage vs. charge function

The fundamental formula of the capacitor can be substituted into the result.

$$Q = CU \qquad E_{pot} = \frac{1}{2}\,\frac{Q^2}{C} = \frac{1}{2}\,\frac{(CU)^2}{C} = \frac{1}{2}\,CU^2 \tag{2.58}$$
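Practical cases mostly use the last formula, since the voltage is usually the known parameter.

2.8.2 Electrostatic energy density

A plate capacitor is studied. The following pieces of information are available:

$$U = Ed \qquad C = \varepsilon_0\varepsilon_r\frac{A}{d} \qquad E_{pot} = \frac{1}{2}\,CU^2 \tag{2.59}$$

The notations are as defined earlier. Let us substitute into the last formula:

$$E_{pot} = \frac{1}{2}\,CU^2 = \frac{1}{2}\,\varepsilon_0\varepsilon_r\frac{A}{d}\,(Ed)^2 = \frac{1}{2}\,\varepsilon_0\varepsilon_r E^2 (Ad) \tag{2.60}$$

In the last formula the volume of the plate capacitor emerges. The energy density ($e_{pot}$) can be calculated as follows:

$$e_{pot} = \frac{E_{pot}}{Ad} = \frac{1}{2}\,\varepsilon_0\varepsilon_r E^2 = \frac{1}{2}\,E\,(\varepsilon_0\varepsilon_r E) = \frac{1}{2}\,ED \quad \left[\frac{\mathrm{J}}{\mathrm{m}^3}\right] \tag{2.61}$$

This result is also true in full generality in isotropic insulators. In that case the dot product of the vectors is used.

$$e_{pot} = \frac{1}{2}\,\mathbf{E}\cdot\mathbf{D} \quad \left[\frac{\mathrm{J}}{\mathrm{m}^3}\right] \tag{2.62}$$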
2.8.2
Electrostatic forces can be determined with the principle of virtual work, provided the potential energy of a charge arrangement can be expressed as a function of some position coordinate. In this case the derivative of the potential energy gives the intensity of the force. The tedious integration of Coulomb's law can be replaced by the calculation of the potential energy, which is a far easier task in most cases.

Demonstration example

Find the pressure exerted on the dielectric material between the two plates of a charged and disconnected plate capacitor.

Figure 2.17: Determination of the electrostatic pressure

The following pieces of information are available:

$$F = -\frac{dE_{pot}}{dx} \qquad C(x) = \varepsilon_0\varepsilon_r\frac{A}{x} \qquad E_{pot} = \frac{1}{2}\,\frac{Q^2}{C} \tag{2.63}$$

The notations are as defined earlier. Let us substitute the formula of the capacitance into the last formula:

$$E_{pot} = \frac{1}{2}\,\frac{Q^2}{C} = \frac{Q^2}{2}\,\frac{1}{C} = \frac{Q^2}{2}\,\frac{x}{\varepsilon_0\varepsilon_r A} \tag{2.64}$$

Let us carry out the differentiation. The absolute value of the force is as follows:

$$F = \frac{Q^2}{2A\varepsilon_0\varepsilon_r} \tag{2.65}$$
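Thus a constant attractive force emerges between the plates of the capacitor. This is not surprising, since opposite charges face each other at a small distance. The following additional pieces of information are available:

$$Q = CU \qquad E = \frac{U}{d} \qquad C = \varepsilon_0\varepsilon_r\frac{A}{d} \qquad F = \frac{Q^2}{2A\varepsilon_0\varepsilon_r} \tag{2.66}$$

The notations are as defined earlier. The pressure is denoted $p$. Let us substitute into the last formula:

$$pA = F = \frac{Q^2}{2A\varepsilon_0\varepsilon_r} = \left(\varepsilon_0\varepsilon_r\frac{A}{d}\right)^2 U^2\,\frac{1}{2A\varepsilon_0\varepsilon_r} = \frac{A}{2}\,\varepsilon_0\varepsilon_r\left(\frac{U}{d}\right)^2 = \frac{A}{2}\,\varepsilon_0\varepsilon_r E^2 \tag{2.67}$$

Dividing by the area gives the pressure:

$$p = \frac{1}{2}\,\varepsilon_0\varepsilon_r E^2$$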
This formula has already appeared in this chapter: the energy density and the pressure on the dielectric material between the plates are expressed by the same formula. The critical electric field ($E_{cr}$) is the limit at which electric discharge occurs. The manufacturers of capacitors approach this limit carefully by using tough materials. So the maximum pressure is determined approximately by the above formula at the critical electric field intensity. For estimation purposes let us choose the following values: $\varepsilon_r = 3$ and $E_{cr} = 10^6$ V/m.

$$p_{max} = \frac{1}{2}\,\varepsilon_0\varepsilon_r E_{cr}^2 = \frac{1}{2}\cdot 8.86\cdot 10^{-12}\cdot 3\cdot 10^{12} = 13.3\ \mathrm{Pa} \tag{2.69}$$
This pressure is an insignificant mechanical load on the dielectric material between the plates.
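The principle of virtual work used above can be illustrated numerically: the force is estimated from the change of the stored energy under a small displacement of the plate and compared with the closed-form result. In the sketch below the charge, plate area and plate separation are hypothetical sample values, and the central difference is only one possible way to approximate the derivative.

```python
EPS0 = 8.86e-12   # permittivity of vacuum as used in the text, As/(Vm)

def energy(x, charge, area, eps_r):
    """Stored energy of the disconnected capacitor, E_pot = Q^2 x / (2 eps0 eps_r A)."""
    return charge**2 * x / (2 * EPS0 * eps_r * area)

def force_virtual_work(x, charge, area, eps_r, h=1e-6):
    """Magnitude of dE_pot/dx, estimated with a central difference of step h."""
    return (energy(x + h, charge, area, eps_r) - energy(x - h, charge, area, eps_r)) / (2 * h)

def force_closed_form(charge, area, eps_r):
    """F = Q^2 / (2 A eps0 eps_r), Eq. (2.65)."""
    return charge**2 / (2 * area * EPS0 * eps_r)

def pressure(e_field, eps_r):
    """p = eps0 eps_r E^2 / 2, the same expression as the energy density."""
    return 0.5 * EPS0 * eps_r * e_field**2

# Hypothetical capacitor: Q = 1e-8 As, A = 1e-2 m^2, eps_r = 3, plate separation 1 mm
print(force_virtual_work(1e-3, 1e-8, 1e-2, 3.0), force_closed_form(1e-8, 1e-2, 3.0))
print(pressure(1e6, 3.0), "Pa at the critical field of the estimate above")
```

Since the energy is linear in the plate separation for a fixed charge, the estimated force does not depend on the separation chosen.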
Consider two metal electrodes at different potentials. The voltage between them is the difference of the potentials. Now connect the electrodes by means of a wire. Experiment proves that electric current flows in the wire as long as the voltage is sustained. The value of the current is the time derivative of the charge transferred. The unit of electric current is the ampere [A], which is a fundamental quantity in the SI system. Therefore the electric charge is a derived quantity and its unit is the ampere second [As], which is also called the coulomb.

$$I = \frac{dQ}{dt} \tag{3.1}$$
Currents in close proximity exert forces on each other. The definition of the ampere is based on the force interaction between two parallel wires which carry the same current. It is worth mentioning here the important fact that parallel currents attract while opposite currents repel each other. This is somewhat contrary to what might be anticipated. It is also important to note that the direction of the electric current is downhill in the potential field. By definition the direction is from the plus to the minus electrode. This is always true, no matter what kind of charge carrier is involved. If the charge carrier is negative (mostly electrons), then the direction of the mechanical flow is just opposite to the current direction. Experiments show that the intensity of the force ($F$) is proportional to the currents ($I$) and to the length ($l$) of the wire, while it is inversely proportional to the separation ($r$) of the parallel wires. To create an equation from the proportionalities a coefficient ($\mu_0/2\pi$) is introduced.

$$F = \frac{\mu_0}{2\pi}\,\frac{I^2\, l}{r} \tag{3.2}$$

The parameter $\mu_0$ is a universal constant of nature and is called the permeability of vacuum. Its numerical value is $4\pi\cdot 10^{-7}$ Vs/Am.

Figure 3.1: Attractive force between parallel currents

Based on the formula, the definition of the ampere is as follows: two identical parallel currents have the value of 1 A if the attractive force between them is $2\cdot 10^{-7}$ N, provided both the length of the wire and the separation are one meter. The force to be measured is obviously very small, so much higher currents and far smaller separations are used in real measurements.
3.2
The current density is a more fundamental physical quantity than the current itself. The current density is a vector. Its direction shows the local direction of the current. The flux of the current density through an open surface ($S$) gives the actual current flowing through the rim of the surface. The unit of the current density is A/m$^2$.

$$I = \int_S \mathbf{j}\,d\mathbf{A} \tag{3.3}$$

For homogeneous current density and a plane surface the above integral can be replaced by the dot product of the current density and the corresponding area vector.

$$I = \mathbf{j}\cdot\mathbf{A} \tag{3.4}$$

If the current density vector and the area vector are parallel, then the absolute value of the current density can be expressed from the equation:

$$j = \frac{I}{A} \tag{3.5}$$
3.3 Ohm's law
Experiments show that the current ($I$) is proportional to the voltage ($U$) applied. The coefficient between them is the conductance ($G$) of the conductor. The unit of the conductance is A/V, called the siemens.

$$I = GU \tag{3.6}$$

The reciprocal of the conductance is called the resistance ($R$). The unit of the resistance is V/A, called the ohm ($\Omega$).

$$R = \frac{1}{G} = \frac{U}{I} \tag{3.7}$$

The resistance of a cylindrical conductor is proportional to the length ($l$) and inversely proportional to the cross-sectional area ($A$). The coefficient is characteristic of the material of the conductor and is called the resistivity ($\rho$). Its unit is the ohm meter (Vm/A).

$$R = \rho\,\frac{l}{A} \tag{3.8}$$

The reciprocal of the resistivity is called the conductivity ($\sigma$):

$$\sigma = \frac{1}{\rho} \tag{3.9}$$

Let us substitute into Ohm's law:

$$U = RI = \rho\,\frac{l}{A}\,I \tag{3.10}$$

$$\frac{U}{l} = \rho\,\frac{I}{A} \tag{3.11}$$

The left-hand side is the electric field ($E$) in the conductor, while the right-hand side contains the current density ($j$).

$$E = \rho\, j \tag{3.12}$$

This equation is the differential Ohm's law. In terms of the conductivity the formula is as follows:

$$\mathbf{j} = \sigma\,\mathbf{E} \tag{3.13}$$
3.4 Joule's law
The power dissipated by the conductor is the time derivative of the work done by the electric field. Its unit is the watt (J/s = W).

$$P = \frac{dW}{dt} = U\,\frac{dQ}{dt} = UI \tag{3.14}$$

This formula is Joule's law. Combining it with Ohm's law, the following formulas can be concluded.

$$P = UI = RI^2 = \frac{U^2}{R} \tag{3.15}$$

Let us use the formulas of $U$ and $I$ and substitute them into the above equation.

$$P = UI = (El)(jA) = Ej\,(Al) \tag{3.16}$$

The product $Al$ is the volume of the conductor. After dividing by the volume, the power density ($p$) can be expressed. Its unit is watt per cubic meter (W/m$^3$).

$$p = \mathbf{E}\cdot\mathbf{j} \tag{3.17}$$

This formula is the differential Joule's law. Using the differential Ohm's law, two more expressions can be found:

$$p = \sigma E^2 \qquad p = \rho\, j^2 \tag{3.18}$$
3.5 Microphysical interpretation
The charge carriers collide frequently with the ion lattice in the conductive material. Between collisions they are accelerated by the electric eld. So the motion consists of short acceleration periods and sudden stops. The resulting motion can be characterized 49
by the average speed which is called the drift velocity (vdrif t ). Surprisingly this value is very small, roughly one meter per hour. Experiments show that the drift velocity is proportional to the electric eld aecting the conductor. The coecient is called the mobility (). vdrif t = E (3.19)
Consider a piece of conducting material with cross-sectional area A. The material contains charge carriers with the density n and with charge q. The infinitesimal amount of charge transferred by the material in an infinitesimal time period is as follows:
$$dQ = v_{drift}\, dt\, A\, nq \tag{3.20}$$
The current can be expressed:
$$I = \frac{dQ}{dt} = v_{drift}\, A\, nq \tag{3.21}$$
The current density can also be calculated:
$$j = v_{drift}\, nq \tag{3.22}$$
Now we substitute the mobility:
$$j = nq\mu E \tag{3.23}$$
Compare this result with the differential Ohm's law. This provides a microphysical explanation of the conductivity, which was introduced earlier as a phenomenological material parameter.
$$\sigma = nq\mu \tag{3.24}$$
Accordingly the conductivity of a material depends on two major factors: the mobility and the density of the charge carriers. If several types of charge carriers are involved in the current, their individual conductivities are summed. If the temperature of the material is increased, the conductivity can either increase or decrease. The conductivity increases if the generation of charge carriers is the dominant effect (mostly in semiconductors). The conductivity decreases if the reduction of the mobility is the dominant effect (mostly in metals).
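The statement that the drift velocity is roughly one meter per hour can be made plausible with a short calculation based on (3.21) to (3.24). The sketch below assumes a copper-like carrier density and resistivity and a wire carrying 1 A through 1 mm$^2$; these numbers and the function name `drift_velocity` are illustrative assumptions, not data from the text.

```python
ELEMENTARY_CHARGE = 1.602e-19   # As

def drift_velocity(current, n, q, area):
    """Drift velocity from I = v_drift * A * n * q (3.21)."""
    return current / (n * q * area)

# Assumed illustrative values (not from the text).
n = 8.5e28        # 1/m^3, copper-like density of conduction electrons
rho = 1.7e-8      # ohm*m, copper-like resistivity
current = 1.0     # A
area = 1.0e-6     # m^2

v = drift_velocity(current, n, ELEMENTARY_CHARGE, area)
E = rho * current / area               # field from the differential Ohm's law
mobility = v / E                       # definition (3.19)
sigma = n * ELEMENTARY_CHARGE * mobility   # microphysical conductivity (3.24)

print("drift velocity:", v, "m/s =", v * 3600.0, "m/h")
print("mobility:", mobility, "m^2/Vs")
print("n q mu =", sigma, " 1/rho =", 1.0 / rho)
```

Under these assumptions the drift velocity comes out as a fraction of a meter per hour, and the product $nq\mu$ reproduces the reciprocal of the resistivity.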
In chapter 3 the definition of the ampere is based on the attractive force between two identical parallel currents. In this chapter the two currents may be different. One of the currents is considered the source current (I) while the other one is the test current (i). The intensity of the force (F) is proportional to the currents (I, i) and to the length (l) of the wire, while it is inversely proportional to the separation (r) of the parallel wires. To turn the proportionalities into an equation, a coefficient ($\mu_0 / 2\pi$) is introduced.
$$F = \frac{\mu_0}{2\pi} \frac{Ii}{r} l \tag{4.1}$$
The parameter $\mu_0$ is a universal constant of nature called the permeability of vacuum. Its numerical value is $4\pi \cdot 10^{-7}$ Vs/Am. Experiments showed that the force is proportional to the test current (i) and to the length (l) of the test wire. Accordingly the force can be written as follows:
$$F = \frac{\mu_0}{2\pi} \frac{Ii}{r} l = Bil \tag{4.2}$$
Here B is a coefficient which is determined solely by the source current (I) and characterizes the magnetization level of the space (the magnetic induction field) generated by the source current.
$$B = \frac{\mu_0 I}{2\pi r} \qquad \left[\frac{Vs}{m^2} = Tesla\right] \tag{4.3}$$
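A short numerical sketch of (4.1) to (4.3) follows. With two currents of 1 A at a separation of 1 m, the force on each meter of wire is $2 \cdot 10^{-7}$ N, which is the value behind the force-based definition of the ampere mentioned above. The function names are hypothetical and chosen only for this example.

```python
import math

MU_0 = 4.0 * math.pi * 1e-7   # Vs/Am, permeability of vacuum

def induction_of_straight_wire(source_current, r):
    """Magnetic induction B = mu0 * I / (2 pi r) of a long straight wire (4.3)."""
    return MU_0 * source_current / (2.0 * math.pi * r)

def force_on_parallel_wire(source_current, test_current, length, r):
    """Force F = B * i * l on a parallel test wire (4.2)."""
    return induction_of_straight_wire(source_current, r) * test_current * length

B = induction_of_straight_wire(1.0, 1.0)
F = force_on_parallel_wire(1.0, 1.0, 1.0, 1.0)
print("B =", B, "T")
print("F =", F, "N per meter of wire")
```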
Figure 4.1: Parallel currents: side view (left) and upper view (right)
Up to this point it almost looks as if B were a scalar. This is not the case. B is in fact the vector of the magnetic induction ($\mathbf{B}$), defined as follows. Let us study the upper view of the currents (Fig 4.1). Due to the cylindrical symmetry of the infinite straight current, the magnetic induction lines are circles around the current. Let us attribute a right-hand screw turning direction to the lines relative to the current direction. Accordingly, when the current flows out of the sheet, the magnetic induction lines go around the current counterclockwise (CCW).
4.2
On the other hand the attractive force vector points toward the source current. This direction is perpendicular both to the direction of the $\mathbf{B}$ vector and to the test current. This relation suggests using the vector product as the mathematical tool.
$$\mathbf{F} = i\, \mathbf{l} \times \mathbf{B} \tag{4.4}$$
The above formula accurately describes both the direction and the intensity of the force. The direction of the current is turned into the direction of the magnetic induction, and the right-hand screw rule determines the direction of the force. The test wire is not necessarily straight; it can be any curve. A very small (infinitesimal) section of a curve can be considered straight, so the above formula is valid for the infinitesimal contribution to the force.
$$d\mathbf{F} = i\, d\mathbf{l} \times \mathbf{B} \tag{4.5}$$
The total force results as the curve integral of the contributions.
$$\mathbf{F} = \int_g i\, d\mathbf{l} \times \mathbf{B} \tag{4.6}$$
The validity of the earlier formula can be extended to point charges traveling in space. A moving point charge is equivalent to a certain current. This relation is summarized below:
$$i\, d\mathbf{l} = jA\, d\mathbf{l} = \rho v A\, d\mathbf{l} = \rho A\, dl\, \mathbf{v} = \mathbf{v}\, dQ \tag{4.7}$$
Here we used the expression of the current density by means of the charge density ($\rho$) and the velocity: $\mathbf{j} = \rho \mathbf{v}$. The above expression can be substituted into the expression of the force:
$$d\mathbf{F} = dQ\, (\mathbf{v} \times \mathbf{B}) \tag{4.8}$$
In the case of a point charge there is a definite amount of charge, and so the force is a definite vector too.
$$\mathbf{F} = Q\, (\mathbf{v} \times \mathbf{B}) \tag{4.9}$$
This formula describes the Lorentz force. Accordingly the magnetic field can act on a charged particle only when it moves. A particle at rest does not feel the magnetic field. If the above formula is divided by the charge, the Lorentz electric field ($\mathbf{E}_L$) is the result.
$$\mathbf{E}_L = \mathbf{v} \times \mathbf{B} \tag{4.10}$$
This quantity will be used extensively in connection with the motion related electromagnetic induction phenomena.
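The vector character of the Lorentz force (4.9) is easy to explore numerically. The following sketch uses plain tuples for vectors; the helper names `cross` and `lorentz_force` and the sample values are assumptions made for the illustration.

```python
def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def lorentz_force(charge, velocity, induction):
    """Lorentz force F = Q (v x B), see (4.9)."""
    return tuple(charge * c for c in cross(velocity, induction))

# Assumed sample values: a proton-like charge moving along x in a field along z.
Q = 1.6e-19                  # As
v = (1.0e5, 0.0, 0.0)        # m/s
B = (0.0, 0.0, 0.1)          # Tesla

F_moving = lorentz_force(Q, v, B)
F_at_rest = lorentz_force(Q, (0.0, 0.0, 0.0), B)
E_L = cross(v, B)            # Lorentz electric field (4.10)

print("force on moving charge:", F_moving)
print("force on charge at rest:", F_at_rest)
print("Lorentz electric field:", E_L)
```

The force is perpendicular to both the velocity and the induction, and it vanishes for the particle at rest.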
4.2.1
Cyclotron frequency
Consider a homogeneous magnetic field (B = 0.1 Tesla). Inject a proton ($m_p = 1.67 \cdot 10^{-27}$ kg, $q_p = 1.6 \cdot 10^{-19}$ As) normal to the magnetic field with an initial kinetic energy of 1 keV, which corresponds to an accelerating voltage of $U_0 = 10^3$ V. Determine how the particle moves in the field. The Lorentz force is always normal to the velocity, therefore the speed (and so the kinetic energy) of the particle is constant. The Lorentz force generates only centripetal acceleration, and this way the particle moves on a circular trajectory. The parameters of the motion can be determined by means of the equation of motion:
$$qvB = m \frac{v^2}{r} \tag{4.11}$$
The equation above is written in the radial direction of the circle. The left-hand side is the absolute value of the Lorentz force, while the right-hand side is the mass multiplied by the centripetal acceleration. After some rearrangement:
$$\omega_{cyclotron} = \frac{v}{r} = \frac{qB}{m} = \frac{1.6 \cdot 10^{-19} \cdot 0.1}{1.67 \cdot 10^{-27}} = 9.58 \cdot 10^{6}\ \frac{rad}{s} = 1.53\ MHz \tag{4.12}$$
The velocity can be calculated from the initial kinetic energy:
$$\frac{1}{2} m v^2 = qU_0 \tag{4.13}$$
$$v = \sqrt{\frac{2qU_0}{m}} = \sqrt{\frac{2 \cdot 1.6 \cdot 10^{-19} \cdot 10^{3}}{1.67 \cdot 10^{-27}}} = 4.38 \cdot 10^{5}\ \frac{m}{s} \tag{4.14}$$
The radius of the circulation can be expressed:
$$r = \frac{mv}{qB} = \frac{v}{\omega_c} = \frac{4.38 \cdot 10^{5}}{9.58 \cdot 10^{6}} = 4.57 \cdot 10^{-2}\ m = 4.57\ cm \tag{4.15}$$
Note that the cyclotron frequency is independent of the energy of the particle. This feature made it possible to construct the first particle accelerator (Ernest Lawrence, 1932). The charged particles are forced to circulate by means of a homogeneous magnetic field. They are accelerated by a high-frequency electric field which is in resonance with the cyclotron frequency. As the energy of the particles grows, the radius of the circulation increases, but the cyclotron frequency does not change, so the resonance is maintained. The particles can be accelerated as long as the classical approach works. At higher energies the relativistic description is necessary.
If the injection of the particle is not exactly perpendicular to the B vector, the initial velocity should be decomposed into parallel and normal components. The parallel component is unaffected by the magnetic field, while the normal component generates uniform circulation with the cyclotron frequency. Ultimately the trajectory of the particle is a helix wound around the magnetic induction lines. This feature is used extensively in plasma generation techniques, where an additional external magnetic field is used to increase the efficiency of the ionization by increasing the path length of the charged particles. The longer the path, the higher the probability of collisions and thus of ionization.
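The numbers of this example can be reproduced with the following sketch, which evaluates (4.12), (4.14) and (4.15) for the data given above. The function name `cyclotron_motion` is a hypothetical helper introduced for the illustration.

```python
import math

def cyclotron_motion(q, m, B, U0):
    """Return (omega, frequency, speed, radius) for a particle of charge q and
    mass m, accelerated through the voltage U0 and injected normal to B."""
    omega = q * B / m                      # cyclotron angular frequency (4.12)
    speed = math.sqrt(2.0 * q * U0 / m)    # from (1/2) m v^2 = q U0 (4.14)
    radius = speed / omega                 # r = m v / (q B) (4.15)
    return omega, omega / (2.0 * math.pi), speed, radius

# Data of the example: proton, B = 0.1 Tesla, U0 = 1000 V.
omega, frequency, speed, radius = cyclotron_motion(1.6e-19, 1.67e-27, 0.1, 1.0e3)
print("omega  =", omega, "rad/s")
print("f      =", frequency, "Hz")
print("v      =", speed, "m/s")
print("r      =", radius, "m")

# Four times the energy doubles the radius but leaves the frequency unchanged.
omega_4, _, _, radius_4 = cyclotron_motion(1.6e-19, 1.67e-27, 0.1, 4.0e3)
print("ratio of radii:", radius_4 / radius, " same omega:", omega_4 == omega)
```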
4.2.2
The effect was discovered by Edwin Hall in 1879. Consider a layer of conducting material in the form of a strip. A direct current (I) flows parallel to the longer dimension. A homogeneous magnetic field (B) crosses the material normal to its surface. The Hall voltage is measured between the two sides of the strip.
Figure 4.2: Hall effect
The following pieces of information are available:
$$I = j\, l\, d \quad [A] \qquad j = v_{drift}\, nq \quad \left[\frac{A}{m^2}\right] \tag{4.16}$$
Here l and d are the width and the thickness of the strip, respectively.
$$I = v_{drift}\, nq\, l\, d \tag{4.17}$$
The drift velocity can be expressed:
$$v_{drift} = \frac{I}{nq\, l\, d} \tag{4.18}$$
Because the velocity and the magnetic induction are perpendicular, the absolute value of the Lorentz electric field is simply the product of the two factors.
$$E_L = v_{drift} B \tag{4.19}$$
The generated Hall voltage is the product of the width (l) and the Lorentz field intensity. Integration can be omitted because the field is homogeneous.
$$U_L = v_{drift} B\, l = \frac{BI}{nq\, d} = \frac{1}{nq} \frac{BI}{d} = R_H \frac{BI}{d} \tag{4.20}$$
The formula shows that the polarity of the Hall voltage, and of the Hall coefficient $R_H = 1/(nq)$, depends on the polarity of the charge carriers. In all other experiments, electrons traveling from left to right produce the same effect as positive particles traveling in the opposite direction, so one can never tell from the current alone which is actually the case. This is the only experiment in which the polarity of the charge carrier makes a qualitative difference. In modern electronic devices Hall detectors are used mostly for measuring magnetic fields. They are also used as commutators in electric motors and in the ignition systems of cars.
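The order of magnitude of the Hall voltage and its dependence on the sign of the charge carriers can be illustrated with (4.20). The sketch below assumes a copper-like carrier density and a strip thickness of 0.1 mm; these values and the function name `hall_voltage` are illustrative assumptions only.

```python
ELEMENTARY_CHARGE = 1.602e-19   # As

def hall_voltage(B, current, n, q, thickness):
    """Hall voltage U_L = R_H * B * I / d with R_H = 1 / (n q), see (4.20)."""
    hall_coefficient = 1.0 / (n * q)
    return hall_coefficient * B * current / thickness

# Assumed illustrative values (not from the text).
n = 8.5e28            # 1/m^3, copper-like carrier density
B = 1.0               # Tesla
current = 1.0         # A
thickness = 1.0e-4    # m

U_positive = hall_voltage(B, current, n, +ELEMENTARY_CHARGE, thickness)
U_negative = hall_voltage(B, current, n, -ELEMENTARY_CHARGE, thickness)
print("positive carriers:", U_positive, "V")
print("negative carriers:", U_negative, "V")
```

With a metallic carrier density the voltage is below a microvolt, and it changes sign with the polarity of the carriers. Materials with far fewer carriers give a correspondingly larger signal, since the voltage is inversely proportional to n.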
4.3
Magnetic dipole
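Consider a circular loop current (I) with given radius (r) in the x, y plane of the Cartesian coordinate system. The loop current is surrounded by a homogeneous magnetic field ($\mathbf{B}$) of arbitrary direction. The infinitesimal force vector acting on an infinitesimal section of the loop is as follows:
$$d\mathbf{F} = I\, d\mathbf{r} \times \mathbf{B} \tag{4.21}$$
The corresponding infinitesimal torque about the center of the loop is:
$$d\mathbf{M} = \mathbf{r} \times d\mathbf{F} \tag{4.22}$$
$$d\mathbf{M} = I\, \mathbf{r} \times (d\mathbf{r} \times \mathbf{B}) \tag{4.23}$$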
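Now we use the formula for the vector triple product:
$$\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = \mathbf{b}(\mathbf{a} \cdot \mathbf{c}) - \mathbf{c}(\mathbf{a} \cdot \mathbf{b})$$
Accordingly:
$$d\mathbf{M} = I \left[ d\mathbf{r}\, (\mathbf{B} \cdot \mathbf{r}) - \mathbf{B}\, (\mathbf{r} \cdot d\mathbf{r}) \right] \tag{4.24}$$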
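The second term is zero because the $\mathbf{r}$ and $d\mathbf{r}$ vectors are perpendicular. So ultimately the torque to be integrated is the following:
$$d\mathbf{M} = I\, d\mathbf{r}\, (\mathbf{B} \cdot \mathbf{r})$$
$$\mathbf{M} = \int_0^{2\pi} d\mathbf{M} = I \int_0^{2\pi} (\mathbf{B} \cdot \mathbf{r})\, d\mathbf{r} \tag{4.25}$$
Next the formulas to be substituted:
$$\mathbf{B} = B_x \mathbf{i} + B_y \mathbf{j} + B_z \mathbf{k} \tag{4.26}$$
$$\mathbf{r} = \mathbf{i}\, r \cos\varphi + \mathbf{j}\, r \sin\varphi + 0\, \mathbf{k} \tag{4.27}$$
$$d\mathbf{r} = \frac{d\mathbf{r}}{d\varphi}\, d\varphi = r(-\mathbf{i} \sin\varphi + \mathbf{j} \cos\varphi)\, d\varphi \tag{4.28}$$
$$\mathbf{M} = I r^2 \int_0^{2\pi} (B_x \cos\varphi + B_y \sin\varphi)(-\mathbf{i} \sin\varphi + \mathbf{j} \cos\varphi)\, d\varphi \tag{4.29}$$
The formula is further simplified by the orthogonality of the sine and cosine functions. The only remaining terms contain either pure sine squared or pure cosine squared.
$$\mathbf{M} = I r^2 \left( \mathbf{j}\, B_x \int_0^{2\pi} \cos^2\varphi\, d\varphi - \mathbf{i}\, B_y \int_0^{2\pi} \sin^2\varphi\, d\varphi \right) \tag{4.30}$$
$$\int_0^{2\pi} \cos^2\varphi\, d\varphi = \int_0^{2\pi} \sin^2\varphi\, d\varphi = \pi \tag{4.31}$$
$$\mathbf{M} = I r^2 \pi (B_x \mathbf{j} - B_y \mathbf{i}) = I r^2 \pi (-B_y \mathbf{i} + B_x \mathbf{j}) \tag{4.32}$$
Here one can discover the area of the circle.
$$A = r^2 \pi$$
Finally the formula of the torque emerges.
$$\mathbf{M} = IA\, (-B_y \mathbf{i} + B_x \mathbf{j}) \tag{4.33}$$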
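This formula is not easy to handle, let alone to remember. If one attributes vector character to the area, as has been done before, the above formula can be written in a much more elegant way.
$$\mathbf{M} = I \mathbf{A} \times \mathbf{B} \tag{4.34}$$
Let us check this formula.
$$\mathbf{M} = I \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 0 & 0 & A \\ B_x & B_y & B_z \end{vmatrix} = IA\, (-B_y \mathbf{i} + B_x \mathbf{j}) \tag{4.35}$$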
This is a perfect match. For historical reasons the formula of the torque is rewritten with the permeability of vacuum:
$$\mathbf{M} = \mu_0 I \mathbf{A} \times \frac{\mathbf{B}}{\mu_0} \tag{4.36}$$
The first factor in the cross product is the magnetic dipole moment ($\mathbf{m}$) of the current loop.
$$\mathbf{m} = \mu_0 I \mathbf{A} \qquad [Vsm] \tag{4.37}$$
The second factor is denoted $\mathbf{H}$ and called the magnetic field, measured in A/m. Its physical meaning will be explained in the next chapter.
$$\mathbf{M} = \mathbf{m} \times \mathbf{H} \tag{4.38}$$
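The compact torque formula can also be checked numerically by summing the contributions $d\mathbf{M} = \mathbf{r} \times (I\, d\mathbf{r} \times \mathbf{B})$ over many small sections of the loop and comparing the result with $I \mathbf{A} \times \mathbf{B}$. The sketch below does this for an arbitrarily chosen field; the function names and the sample values are assumptions made for the illustration.

```python
import math

def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torque_by_integration(I, r, B, steps=720):
    """Sum dM = r x (I dr x B) over small sections of a loop in the x, y plane."""
    M = [0.0, 0.0, 0.0]
    dphi = 2.0 * math.pi / steps
    for k in range(steps):
        phi = (k + 0.5) * dphi
        position = (r * math.cos(phi), r * math.sin(phi), 0.0)
        dr = (-r * math.sin(phi) * dphi, r * math.cos(phi) * dphi, 0.0)
        dF = tuple(I * c for c in cross(dr, B))
        dM = cross(position, dF)
        for axis in range(3):
            M[axis] += dM[axis]
    return tuple(M)

def torque_by_formula(I, r, B):
    """Torque M = I A x B with the area vector along the z axis."""
    area_vector = (0.0, 0.0, math.pi * r * r)
    return tuple(I * c for c in cross(area_vector, B))

# Assumed sample values: 2 A in a loop of 10 cm radius, field in a general direction.
B = (0.3, -0.2, 0.5)   # Tesla
numeric = torque_by_integration(2.0, 0.1, B)
formula = torque_by_formula(2.0, 0.1, B)
print("integration:", numeric)
print("formula:    ", formula)
```

The $B_z$ component, which is parallel to the area vector, does not contribute to the torque in either calculation.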
The magnetic dipole moment turns spontaneously into the direction of the external magnetic field and stays there. In this position the least amount of potential energy is stored in the magnetic dipole. Obviously the largest amount of potential energy is stored in the opposite position. Let us find the work needed to turn the dipole from the lowest energy position to the highest.
$$W = \int_0^{\pi} M\, d\alpha = \int_0^{\pi} mH \sin\alpha\, d\alpha = mH \left[ -\cos\alpha \right]_0^{\pi} = 2mH \tag{4.39}$$
According to this result the potential energy of the magnetic dipole is as follows:
$$E_{pot} = -\mathbf{m} \cdot \mathbf{H} \tag{4.40}$$
This formula provides the lowest energy at the parallel (spontaneous) position and the highest at the anti-parallel position. The zero of the potential energy is at ninety degrees. The difference between the highest and the lowest value is just the work needed to turn the dipole around.
4.4
The magnetic dipole moment of a solenoid is directed according to the right-hand screw rule. This means that when a right-hand screw is rotated in the sense of the circulating current, its direction of advance defines the direction of the magnetic dipole moment. Let us suspend the solenoid at its center of gravity on a thin thread which allows free turning in the horizontal plane. The solenoid slowly turns parallel to the Earth's magnetic field in such a way that the magnetic dipole moment points to the geographic North Pole. If a permanent magnet rod is suspended in the same way as the solenoid, it will also turn parallel to the Earth's magnetic field.
Figure 4.4: The geographic North Pole of the Earth is in fact a magnetic South Pole
In the case of an ordinary dipole such as a solenoid or a bar magnet, the magnetic dipole moment is directed from the south end to the north end. The magnetic induction lines exit from the north end and enter at the south end. Planet Earth is an exceptional magnetic dipole because its geographic North Pole is in fact a magnetic South Pole. This odd-looking switch follows from the way the poles of an ordinary dipole are named: the north end of an ordinary dipole is the end which turns toward the geographic North Pole of the Earth. Since poles of the same kind repel each other, the pole of the Earth which attracts that north end must be a magnetic south pole.
4.5
Biot-Savart law
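The magnetic field of an infinitesimal current element is described by the Biot-Savart law, where $\mathbf{r}_0$ is the unit vector pointing from the current element to the point of observation.
$$d\mathbf{H} = \frac{I}{4\pi} \frac{d\mathbf{l} \times \mathbf{r}_0}{r^2} \tag{4.41}$$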
Figure 4.5: Biot-Savart law
The infinitesimal current element is surrounded by a circular magnetic field. The sense of rotation of the magnetic field follows the right-hand screw rule. In the equatorial plane the magnetic field decreases with the inverse square of the distance (just like in Coulomb's law). Below and above the equatorial plane the magnetic field decreases with increasing angle and vanishes on the line of the current element. The infinitesimal magnetic contributions can be superimposed, and the overall magnetic effect of any extended current can be calculated by integration. The Biot-Savart law is typically used for currents in a thin wire, where integration along the line provides a fair result.
4.5.1
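Figure 4.6: Biot-Savart law for a straight wire
The following pieces of information are available:
$$d\mathbf{H} = \frac{I}{4\pi} \frac{d\mathbf{l} \times \mathbf{r}_0}{r^2} \qquad |d\mathbf{l}| = \frac{r\, d\alpha}{\cos\alpha} \qquad r = \frac{R}{\cos\alpha} \tag{4.42}$$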
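All contributions to the magnetic field point in the same direction, therefore it is sufficient to integrate the absolute value.
$$|d\mathbf{H}| = \frac{I}{4\pi} \frac{|d\mathbf{l}|\, |\mathbf{r}_0| \sin(\alpha + 90^\circ)}{r^2} = \frac{I}{4\pi} \frac{r\, d\alpha}{\cos\alpha} \frac{\cos\alpha}{r^2} = \frac{I}{4\pi} \frac{d\alpha}{r} = \frac{I}{4\pi} \frac{\cos\alpha}{R}\, d\alpha \tag{4.43}$$
After some simplifications the infinitesimal contribution results:
$$dH = \frac{I \cos\alpha}{4\pi R}\, d\alpha \tag{4.44}$$
The integration is carried out over a symmetrical domain of half visual angle $\alpha$.
$$H = \int_{-\alpha}^{\alpha} \frac{I \cos\alpha'}{4\pi R}\, d\alpha' = \frac{I}{4\pi R} \left[ \sin\alpha' \right]_{-\alpha}^{\alpha} \tag{4.45}$$
The magnetic field of a finite current section seen under the symmetrical half visual angle $\alpha$ can finally be expressed.
$$H = \frac{I}{2\pi R} \sin\alpha \tag{4.46}$$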
If the length of the current section tends to infinity, then $\alpha$ tends to ninety degrees, so $\sin\alpha$ equals unity. This way the result for an infinitely long filament is as follows:
$$H = \frac{I}{2\pi R} \tag{4.47}$$
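The closed formula for the finite straight section can be compared with a direct numerical summation of the Biot-Savart contributions. In the sketch below the wire lies along the z axis and the field is evaluated at the distance R from its midpoint. The function names and sample values are assumptions made for the illustration.

```python
import math

def field_by_summation(I, R, length, steps=20000):
    """Midpoint sum of |dH| = I R dz / (4 pi (R^2 + z^2)^(3/2)), from (4.41)."""
    dz = length / steps
    total = 0.0
    for k in range(steps):
        z = -length / 2.0 + (k + 0.5) * dz
        total += I * R * dz / (4.0 * math.pi * (R * R + z * z) ** 1.5)
    return total

def field_by_formula(I, R, length):
    """H = I sin(alpha) / (2 pi R) with the half visual angle alpha, see (4.46)."""
    alpha = math.atan((length / 2.0) / R)
    return I * math.sin(alpha) / (2.0 * math.pi * R)

# Assumed sample values: 1 A, observation point 10 cm from a 20 cm long section.
numeric = field_by_summation(1.0, 0.1, 0.2)
closed = field_by_formula(1.0, 0.1, 0.2)
very_long = field_by_formula(1.0, 0.1, 1.0e6)
print("summation:", numeric, "A/m")
print("formula:  ", closed, "A/m")
print("very long wire:", very_long, " I/(2 pi R) =", 1.0 / (2.0 * math.pi * 0.1))
```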
4.5.2
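Figure 4.7: Regular polygon
Consider a regular polygon with n sides which carries a current I. The circumscribed circle has the radius R. The half central angle of one side is $\pi/n$. The magnetic field generated by one side in the center is denoted $H_1$. For other notations see the figure above. The following pieces of information are available:
$$H_1 = \frac{I}{2\pi r} \sin\left(\frac{\pi}{n}\right) \qquad r = R \cos\left(\frac{\pi}{n}\right) \qquad H_n = n H_1 \tag{4.48}$$
After substitution:
$$H_n = n \frac{I}{2\pi R \cos(\frac{\pi}{n})} \sin\left(\frac{\pi}{n}\right) = \frac{I}{2R} \frac{n}{\pi} \tan\left(\frac{\pi}{n}\right) \tag{4.49}$$
So altogether the magnetic field in the center of the n-sided regular polygon is as follows:
$$H_n = \frac{I}{2R} \left( \frac{n}{\pi} \tan\frac{\pi}{n} \right) \tag{4.50}$$
Some numerical values:
$$H_3 = \frac{I}{2R} \frac{3}{\pi} \tan 60^\circ = 1.65\, \frac{I}{2R} \tag{4.51}$$
$$H_4 = \frac{I}{2R} \frac{4}{\pi} \tan 45^\circ = 1.27\, \frac{I}{2R} \tag{4.52}$$
$$H_5 = \frac{I}{2R} \frac{5}{\pi} \tan 36^\circ = 1.16\, \frac{I}{2R} \tag{4.53}$$
$$H_6 = \frac{I}{2R} \frac{6}{\pi} \tan 30^\circ = 1.10\, \frac{I}{2R} \tag{4.54}$$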
If the number of sides tends to infinity, the polygon tends to the circle. The limit of the expression in the large parentheses is unity. Ultimately the magnetic field in the center of the circle is equal to the current divided by the diameter (worth remembering).
$$H_{circle} = H_\infty = \frac{I}{2R} \tag{4.56}$$
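The approach of the polygon factor to unity can be followed numerically. The short sketch below evaluates the factor $\frac{n}{\pi} \tan\frac{\pi}{n}$ from (4.50) for increasing n; the function name `polygon_factor` is a hypothetical helper.

```python
import math

def polygon_factor(n):
    """Factor (n / pi) * tan(pi / n) multiplying I / (2R) in (4.50)."""
    return n / math.pi * math.tan(math.pi / n)

for sides in (3, 4, 5, 6, 12, 100, 1000):
    print(sides, round(polygon_factor(sides), 4))
```

The factor decreases monotonically toward one, so the field of the circle is the lower limit of the polygon values for a given circumscribed radius.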
4.6
Ampere's law
The magnetic field of the infinitely long filament was derived earlier in this chapter.
$$H = \frac{I}{2\pi r} \tag{4.57}$$
One can transform this formula in the following way:
$$2\pi r\, H(r) = I \tag{4.58}$$
If the circumference of a circle is multiplied by the magnetic field on that circle, the result is independent of the radius and equals the current. Since the magnetic field is constant along a circle of given radius, one may also write the above equation as a curve integral on the circle.
$$\oint_{circle} \mathbf{H}(\mathbf{r})\, d\mathbf{r} = I \tag{4.59}$$
This equation prompts the following hypothesis: perhaps the integration path need not be a circle but can be any closed loop around the current. This hypothesis can be proven, and it is called Ampere's law. The current may even be the sum of several currents. The direction of integration determines the direction of positive currents according to the right-hand screw rule.
$$\oint_{loop} \mathbf{H}(\mathbf{r})\, d\mathbf{r} = \sum I \tag{4.60}$$
Figure 4.8: Ampere's law proof
The figure shows an infinite straight current normal to the sheet of paper, pointing up toward our eyes. An arbitrary closed loop surrounds the current; this is the path of the line integral. The current is also surrounded by several concentric circles in equidistant steps. The path of integration can be approximated by very small (infinitesimal) sections which are either tangential or radial relative to the concentric circles. This way the integration can be carried out by moving either along one of the circles or in the radial direction. No contribution is generated in the radial direction, since the dot product vanishes for perpendicular vectors. On the circles, however, the contribution is the product of the magnetic field and the length of the arc. The corresponding central angle is denoted $\varphi_i$.
$$\oint_{loop} \mathbf{H}(\mathbf{r})\, d\mathbf{r} = \sum_{i=1}^{n} H(r_i)\, \varphi_i r_i = \sum_{i=1}^{n} \frac{I}{2\pi r_i}\, \varphi_i r_i \tag{4.61}$$
$$= \frac{I}{2\pi} (\varphi_1 + \varphi_2 + \varphi_3 + \dots + \varphi_n) = I \tag{4.62}$$
The radii cancel in each term, and the central angles add up to $2\pi$ because the loop is closed. Ultimately the statement to be proven results. Q.E.D.
Applying Ampere's law to solve problems requires considerations similar to those of Gauss's law. Ampere's law is an integral law. If one wants to use it to find the local magnetic field, the symmetries or regularities of the magnetic field must be known prior to the application. If so, one has to choose an integration path on which the magnetic field is constant, so that the integral reduces to a simple product. The actual magnetic field results after dividing by the length of the integration path.
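Ampere's law can also be tested numerically for a path that is clearly not a circle. The sketch below integrates the field of an infinite straight current, taken from (4.57), along two rectangular paths: one surrounding the current and one lying beside it. The geometry, the current of 5 A and the function names are assumptions made for the illustration.

```python
import math

def field_of_straight_current(I, x, y):
    """H of an infinite straight current along z through the origin (4.57).
    Returns the (x, y) components; the field is tangential and CCW for I > 0."""
    k = I / (2.0 * math.pi * (x * x + y * y))
    return (-k * y, k * x)

def loop_integral(I, corners, steps_per_side=2000):
    """Midpoint approximation of the closed line integral of H along a polygon."""
    total = 0.0
    count = len(corners)
    for a in range(count):
        x0, y0 = corners[a]
        x1, y1 = corners[(a + 1) % count]
        dx = (x1 - x0) / steps_per_side
        dy = (y1 - y0) / steps_per_side
        for k in range(steps_per_side):
            xm = x0 + (k + 0.5) * dx
            ym = y0 + (k + 0.5) * dy
            hx, hy = field_of_straight_current(I, xm, ym)
            total += hx * dx + hy * dy
    return total

current = 5.0   # A
around = [(-1.0, -2.0), (3.0, -2.0), (3.0, 1.0), (-1.0, 1.0)]   # encloses the current
beside = [(1.0, -2.0), (3.0, -2.0), (3.0, 1.0), (1.0, 1.0)]     # does not enclose it

enclosing = loop_integral(current, around)
not_enclosing = loop_integral(current, beside)
print("loop around the current:", enclosing, "A")
print("loop beside the current:", not_enclosing, "A")
```

The first integral reproduces the current and the second one vanishes, within the accuracy of the numerical summation.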
4.6.1
Consider a thick metal rod with radius R, which conducts current with uniform current density j. Find the intensity of the magnetic field as a function of the radius.
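Figure 4.9: Current in the thick rod
The solution uses Ampere's law. A circle concentric with the rod is inflated in radius from zero to infinity.
$$\oint_{loop} \mathbf{H}(\mathbf{r})\, d\mathbf{r} = \sum I \tag{4.63}$$
Inside the rod ($r \le R$) the circle encloses the current $r^2 \pi j$, so $2\pi r H(r) = r^2 \pi j$ and $H(r) = \frac{j}{2} r$. Outside the rod ($r \ge R$) the circle encloses the total current $R^2 \pi j$, so $2\pi r H(r) = R^2 \pi j$ and $H(r) = \frac{j}{2} \frac{R^2}{r}$.
Figure 4.10: H(r) function
Inside the rod the magnetic field intensity increases linearly. Outside, the intensity decays like a hyperbola. The function is continuous at the surface.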
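The shape of the H(r) function can be tabulated with a small piecewise function. The sketch assumes that Ampere's law is applied on concentric circles, so that $2\pi r H = r^2 \pi j$ inside and $2\pi r H = R^2 \pi j$ outside the rod. The sample values and the function name `field_of_thick_rod` are chosen only for the illustration.

```python
def field_of_thick_rod(r, R, j):
    """Magnetic field H(r) of a rod of radius R with uniform current density j."""
    if r <= R:
        return j * r / 2.0           # linear rise inside the rod
    return j * R * R / (2.0 * r)     # hyperbolic decay outside the rod

# Assumed sample values: R = 10 cm, j = 1e5 A/m^2.
R, j = 0.1, 1.0e5
for radius in (0.0, 0.05, 0.1, 0.2, 0.4):
    print(radius, field_of_thick_rod(radius, R, j))
```

The largest value occurs at the surface of the rod, where the two branches meet.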
4.6.2
Solenoid
This is a straight rod coil. The physical model of the solenoid requires the length to be roughly ten times longer than the diameter.
Let us use Ampere's law. The magnetic field is homogeneous in the cavity of the solenoid. A closed rectangular path is chosen. One long side runs parallel with the axis of the coil inside the cavity. The opposite long side is outside the coil, where there is no magnetic field. The remaining two short sides of the rectangle are normal to the magnetic field and can thus be ignored. The length is denoted l and the number of turns is N. So ultimately Ampere's law takes a simple form:
$$\oint_{loop} \mathbf{H}(\mathbf{r})\, d\mathbf{r} = \sum I \tag{4.64}$$
$$Hl = NI \tag{4.65}$$
$$H = \frac{NI}{l} \tag{4.66}$$
4.6.3
Toroidal coil
This is a doughnut-shaped coil. The physical model requires the circumference to be roughly ten times longer than the diameter.
Let us use Ampere's law. The magnetic field is homogeneous in the cavity of the coil. A closed circular path is chosen which runs along the center line of the cavity. The circumference is denoted l and the number of turns is N. So ultimately Ampere's law takes a simple form:
$$\oint_{loop} \mathbf{H}(\mathbf{r})\, d\mathbf{r} = \sum I \tag{4.67}$$
$$Hl = NI \tag{4.68}$$
$$H = \frac{NI}{l} \tag{4.69}$$
4.7
Magnetic flux
The general concept of flux has been treated earlier. It is the scalar-valued surface integral of some vector field. The vector field in the present case is the field of magnetic induction $\mathbf{B}(\mathbf{r})$. The magnetic flux is as follows:
$$\Phi_m = \int_S \mathbf{B}\, d\mathbf{A} \qquad [Vs] \tag{4.70}$$
In the case of the solenoid and the toroid the magnetic flux of one turn is as follows:
$$B = \frac{\mu_0 N I}{l} \qquad \Phi_{turn} = BA = \frac{\mu_0 N I A}{l} \tag{4.71}$$
Later the coil flux will also be used in conjunction with the induced voltage of the coil:
$$\Phi_{coil} = \frac{\mu_0 N^2 I A}{l} \tag{4.72}$$
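The formulas of the solenoid can be collected in one short calculation. The sketch below evaluates (4.66), (4.71) and (4.72) for an air-filled coil with assumed data (1000 turns, 2 A, length 0.5 m, cross section 1 cm$^2$); these numbers and the function name `solenoid` are illustrative assumptions.

```python
import math

MU_0 = 4.0 * math.pi * 1e-7   # Vs/Am, permeability of vacuum

def solenoid(turns, current, length, area):
    """Return (H, B, turn flux, coil flux) of a long air-filled solenoid."""
    H = turns * current / length   # magnetic field (4.66), A/m
    B = MU_0 * H                   # magnetic induction, Tesla
    turn_flux = B * area           # flux of one turn (4.71), Vs
    coil_flux = turns * turn_flux  # flux linked with all N turns (4.72), Vs
    return H, B, turn_flux, coil_flux

H, B, turn_flux, coil_flux = solenoid(1000, 2.0, 0.5, 1.0e-4)
print("H =", H, "A/m")
print("B =", B, "T")
print("turn flux =", turn_flux, "Vs")
print("coil flux =", coil_flux, "Vs")
```

The coil flux contains the number of turns twice: once through the field and once through the number of turns linked by that field.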
5.1
Non-physicists usually divide materials into magnetic and non-magnetic ones. The former are mostly iron and some alloys, which are attracted by a magnet, and the latter are the rest of the materials, which are seemingly unaffected by a magnet. In reality, however, all materials are affected by the magnetic field, though the intensity of the interaction varies over several orders of magnitude. The commonly mentioned magnetic materials are, in technical terms, the ferromagnetic materials. The non-magnetic materials can be classified into two distinct groups, the paramagnetic and the diamagnetic materials, in which the interaction is so weak that it is simply overlooked by the casual observer.
The following experiment makes it possible to distinguish between the kinds of magnetic behavior. Make small samples of the materials to be tested. Each sample is a cylindrical rod roughly fifty millimeters long and five millimeters in diameter. At the middle of the rod there is an indentation normal to the axis of the rod, which sits on a needle so that the rod can rotate freely. The depth of the indentation is roughly eighty percent of the rod diameter. The rotatable sample is placed into the air gap of an unexcited toroidal electromagnet which can create a high-intensity homogeneous magnetic field. Now switch on the electromagnet, which creates a magnetic induction of the order of 0.01 Tesla at least. The samples of different materials will behave as shown in the figures below.
Fig. 5.1: copper, mercury, gold, bismuth
Fig. 5.2: aluminum, chromium, platinum, tungsten
Fig. 5.3: iron, nickel, cobalt and their alloys
Some samples orient themselves perpendicularly to the direction of the magnetic field. These materials are called diamagnetic materials (Fig. 5.1). Some other materials orient themselves parallel to the magnetic field. These are the paramagnetic (Fig. 5.2) and ferromagnetic (Fig. 5.3) materials. The intensity of the interaction can be estimated from the dynamics of the turning. The diamagnetic materials turn most sluggishly. The paramagnetic materials turn somewhat faster but still slowly. The turning of the ferromagnetic material is practically instantaneous compared with the others. The estimated relative intensities of the torques are roughly one, ten and several millions, respectively.
So far the experimental distinction has been carried out. The microphysical interpretation of the experimental results should provide the understanding of the phenomena. The roots of the interpretation lie in the atomic structure of the material.
Some materials contain atoms without any magnetic dipole moment. When these materials are placed into the magnetic field, the atoms become weak dipoles. For quantum mechanical reasons the direction of the dipole moment is just opposite to the external magnetic field. This way the sample becomes a dipole directed opposite to the external magnetic field. Now the external field tends to turn the dipole parallel with itself. As the turning goes on, the atomic dipoles also change their orientation, again opposing the external field. So the only position where the sample comes to rest is the perpendicular position, where virtually no torque acts on the sample. This behavior is characteristic of the diamagnetic materials.
In paramagnetic materials atomic dipoles are present from the outset. Once an external magnetic field is applied, the atomic dipoles in the material orient themselves into the direction of the field, thus the sample becomes a magnetic dipole. In the previous chapter it was shown that magnetic dipoles turn parallel to the external field. This way the sample is positioned as shown in Fig. 5.2. The diamagnetic effect also shows up in paramagnetic materials, but it is overcompensated by the paramagnetic effect.
From the technical point of view the most important type, the ferromagnetic material, is treated last. There are atomic dipoles in the material, similarly to the paramagnetic materials, but these atomic dipoles interact with each other, in contrast to the simple paramagnetic case where the dipoles are affected by the external field alone. Due to the interaction, the dipoles orient themselves parallel with each other and create magnetic domains. The size of such domains is roughly of the order of micrometers, which is large in terms of atomic dimensions, though still rather small in macroscopic terms. Such a material is magnetically neutral as long as the domains are oriented randomly, since the effects of the domains average out to zero. When the external magnetic field is switched on, the domains become oriented and a very high magnetic field is generated. At relatively low external fields the magnetization of the material is proportional to the external field. At high external fields, however, the magnetization saturates, since all the domains have been oriented. This phenomenon is called saturation, and it is accompanied by hysteresis. Another interesting fact is related to the Curie temperature. Above this temperature the material loses its ferromagnetic behavior and becomes paramagnetic. In the case of iron the Curie temperature is 770 degrees Celsius. The relatively high temperature breaks the bonds of the interaction between the dipoles, and the domains disintegrate.
Later in this work the discussion of magnetic phenomena is limited to ferromagnetic materials in the linear magnetization range, where the magnetization and the external field are proportional. Hysteresis phenomena will not be treated in detail.
5.2
Take a solenoid coil with an empty cavity. Switch on the DC current and feel how strong the magnetic force is by approaching it with an iron screwdriver. Now place an iron core into the cavity and check the force again. The experiment confirms that the force is much stronger than before. The conclusion can be drawn readily that the presence of the iron core increased the intensity of the force, in contrast to the electrostatic phenomena, where the presence of dielectric material diminished the intensity of the force. This antisymmetry is rooted in the fact that parallel currents and opposite charges attract each other.
A simple model suggested by Ampere will be discussed here. The atomic dipole is equivalent to a tiny loop current. So the magnetic material is assumed to be filled with such loop currents in random orientation, so that their total magnetic moment is zero. When the external DC field is switched on, they become oriented and the loop currents all circulate in the same direction. Let us consider a point within the iron core. For symmetry reasons the same amount of current goes through this point from right to left as in the opposite direction. Inside the volume of the core the effects of the loop currents neutralize each other. The situation is very different on the surface of the core, where the direction of the loop currents is parallel with the coil current. Because of this, the core behaves as if it were a coil in which the so-called magnetizing current ($I_m$) flows. The direction of the coil current and the direction of the magnetizing current are the same. The magnetic fields of the coil current and of the magnetizing current are as follows:
$$H = \frac{NI}{l} \qquad H_m = \frac{I_m}{l} \tag{5.1}$$
Here N is the number of turns, I and $I_m$ are the coil current and the magnetizing current, and l is the length of the solenoid. Since the directions are the same, the total magnetic field is the sum of these:
$$H_{tot} = H + H_m \tag{5.2}$$
Let us multiply the above equation by $\mu_0$.
$$\mu_0 H_{tot} = \mu_0 H + \mu_0 H_m \tag{5.3}$$
The left-hand side of the equation is the field of the magnetic induction B. The second term on the right-hand side is the magnetization M. Accordingly this can be written:
$$B = \mu_0 H + M \tag{5.4}$$
This formula was derived in a special geometry for the sake of simplicity. However, the result is true in full generality for vectors as well.
$$\mathbf{B} = \mu_0 \mathbf{H} + \mathbf{M} \qquad \left[\frac{Vs}{m^2}\right] \tag{5.5}$$
The meaning of the above formula can be summarized as follows. The magnetic field ($\mathbf{H}$) is generated by the coil current alone, while the vector of magnetization ($\mathbf{M}$) is generated solely by the magnetizing current. The vector of magnetic induction ($\mathbf{B}$) contains the effects of both the coil current and the magnetizing current. The emerging torques and forces are determined by the $\mathbf{B}$ field. Experiments show that the magnetization ($\mathbf{M}$) is proportional to the external field ($\mathbf{H}$), provided no saturation happens. The proportionality can be turned into an equation by introducing a coefficient which is called the magnetic susceptibility ($\chi_m$).
$$\mathbf{M} = \mu_0 \chi_m \mathbf{H} \tag{5.6}$$
Let us substitute this into the former equation:
$$\mathbf{B} = \mu_0 \mathbf{H} + \mathbf{M} = \mu_0 \mathbf{H} + \mu_0 \chi_m \mathbf{H} = \mu_0 (1 + \chi_m) \mathbf{H} \tag{5.7}$$
Here we introduce the concept of relative permeability ($\mu_r$).
$$\mu_r = 1 + \chi_m \tag{5.8}$$
By means of this the final equation can be written:
$$\mathbf{B} = \mu_0 \mu_r \mathbf{H} \tag{5.9}$$
5.3
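In this section vector calculus will be used at a somewhat higher level. The rotation operation (rot) generates a vector field which represents the vortices of some vector field.
$$\mathbf{V}(\mathbf{r}) = V_x(x, y, z)\, \mathbf{i} + V_y(x, y, z)\, \mathbf{j} + V_z(x, y, z)\, \mathbf{k} \tag{5.10}$$
$$rot\, \mathbf{V}(\mathbf{r}) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ V_x & V_y & V_z \end{vmatrix} \tag{5.11}$$
Stokes' theorem relates the line integral of the vector field along a closed curve (g) to the surface integral of its rotation over a surface (S) bounded by that curve:
$$\oint_g \mathbf{V}\, d\mathbf{r} = \int_S (rot\, \mathbf{V})\, d\mathbf{A} \tag{5.12}$$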
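Let us divide the following equation by $\mu_0$:
$$\mathbf{B} = \mu_0 \mathbf{H} + \mathbf{M} \tag{5.13}$$
$$\frac{\mathbf{B}}{\mu_0} = \mathbf{H} + \frac{\mathbf{M}}{\mu_0} \tag{5.14}$$
Take the rotation of the equation above:
$$rot\left(\frac{\mathbf{B}}{\mu_0}\right) = rot\, \mathbf{H} + rot\left(\frac{\mathbf{M}}{\mu_0}\right) \tag{5.15}$$
Each term in the above equation can be interpreted separately based on the first Maxwell equation. The current densities are related both to the conductive current in the coil and to the magnetizing current.
$$rot\left(\frac{\mathbf{B}}{\mu_0}\right) = \mathbf{j}_{tot} \qquad rot\, \mathbf{H} = \mathbf{j}_{coil} \qquad rot\left(\frac{\mathbf{M}}{\mu_0}\right) = \mathbf{j}_{magn} \tag{5.16}$$
$$\mathbf{j}_{tot} = \mathbf{j}_{coil} + \mathbf{j}_{magn} \tag{5.17}$$
Stokes' theorem generates the integral form from the relations above:
$$\oint_g \frac{\mathbf{B}}{\mu_0}\, d\mathbf{r} = I_{tot} \qquad \oint_g \mathbf{H}\, d\mathbf{r} = I_{coil} \qquad \oint_g \frac{\mathbf{M}}{\mu_0}\, d\mathbf{r} = I_m \tag{5.18}$$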
The central integral above is the well-known form of Ampere's law. It expresses that the curve integral of the $\mathbf{H}$ vector on a closed path (g) equals the amount of conductive current (current in a wire) surrounded by the path g. The integral on the right expresses that the curve integral of the magnetization vector ($\mathbf{M}/\mu_0$) equals the total magnetizing current surrounded by the path g. Finally, the integral on the left states that the curve integral of the magnetic induction vector ($\mathbf{B}/\mu_0$) is equal to the total current (conductive and magnetizing) surrounded by the path g.
5.4
Consider two different magnetic materials with plane surfaces. The plane surfaces are joined, thus creating an interface between the materials. This structure is the subject of the following study.
First the $\mathbf{B}$ field is studied. The interface is enclosed by a symmetrical disc-like drum with the base area A. The upper and lower surface vectors are $\mathbf{A}_1$ and $\mathbf{A}_2$, respectively.
$$\mathbf{A}_1 = -\mathbf{A}_2 \qquad |\mathbf{A}_1| = |\mathbf{A}_2| = A \tag{5.19}$$
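Since magnetic monopoles do not exist, the $\mathbf{B}$ field does not have sources. So the flux of the $\mathbf{B}$ field through a closed surface is necessarily zero (Maxwell equation 3).
$$\oint_S \mathbf{B}\, d\mathbf{A} = \mathbf{B}_1 \mathbf{A}_1 + \mathbf{B}_2 \mathbf{A}_2 = 0 \tag{5.20}$$
$$\mathbf{B}_1 \mathbf{A}_2 = \mathbf{B}_2 \mathbf{A}_2 \tag{5.21}$$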
Figure 5.6: The B field on the interface of different magnetic materials
The dot product contains the projection of the $\mathbf{B}$ vectors onto the direction of the $\mathbf{A}_2$ vector, which is the direction normal to the surface. The subscript n means the absolute value of the normal component.
$$B_{1n} A_2 = B_{2n} A_2 \tag{5.22}$$
Since these are now real numbers, the surface area cancels out readily.
$$B_{1n} = B_{2n} \tag{5.23}$$
According to this result the normal component of the $\mathbf{B}$ vector is continuous on the interface of magnetic materials.
Figure 5.7: The H field on the interface of different magnetic materials
The interface is surrounded by a very narrow rectangle-like loop with sections parallel and normal to the surface. The parallel sections of the loop are the vectors $\mathbf{s}$ and $-\mathbf{s}$. The sections normal to the surface are ignored due to their infinitesimal size. According to Ampere's law, the closed loop integral of the $\mathbf{H}$ field equals the total conductive current enclosed by the loop. Here there is no such current, so the right-hand side of the equation is zero.
$$\oint_g \mathbf{H}\, d\mathbf{r} = \mathbf{s} \mathbf{H}_1 + (-\mathbf{s}) \mathbf{H}_2 = 0 \tag{5.24}$$
$$\mathbf{s} \mathbf{H}_1 = \mathbf{s} \mathbf{H}_2 \tag{5.25}$$
The operation of dot product contains the projection of the H vectors to the direction of s vector which is the tangential direction to the surface. The subscript t means the absolute value of the tangential direction component. sH1t = sH2t (5.26)
Once we are among real numbers the length of the tangential section cancels out readily. H1t = H2t According to this result the tangential component of the H the interface of magnetic materials. (5.27) vector is continuous on
76
5.5
Demonstration example
A conductive copper rod of radius $R_1 = 10\,\mathrm{cm}$ carries a uniform current density $j = 10^5\,\mathrm{A/m^2}$. The rod is surrounded by a magnetic coating ($\mu_r = 10^3$) up to the radius $R_2 = 15\,\mathrm{cm}$. Find and sketch the radial dependence of the $H$, $B$ and $M$ vectors. Determine the numerical peak values at the break points and find the magnetizing current.

The first quantity to calculate is the magnetic field ($H$). The tangential component of the $H$ field is continuous on the interface of magnetic materials. In our case the circular magnetic field is tangential to the surface of the magnetic coating, therefore the $H$ field is unaffected by the presence of the coating.

In the rod:

$$2\pi r H(r) = r^2 \pi j$$

$$H(r) = \frac{j}{2} r$$

$$H(r = R_1) = \frac{j}{2} R_1 = \frac{10^5}{2}\cdot 0.1 = 5\cdot 10^3\ \frac{\mathrm{A}}{\mathrm{m}}$$

$$I_{Cond} = 2\pi R_1 H(r = R_1) = 3140\ \mathrm{A}$$

Outside the rod:

$$2\pi r H(r) = R_1^2 \pi j$$

$$H(r) = \frac{j}{2}\frac{R_1^2}{r}$$

$$H(r = R_1) = \frac{j}{2} R_1 = \frac{10^5}{2}\cdot 0.1 = 5\cdot 10^3\ \frac{\mathrm{A}}{\mathrm{m}}$$

Figure 5.8: The magnetic field ($H$) as a function of distance

The peak value of the magnetic field ($H$) can be calculated as above. Both expressions give the same value at $r = R_1$, which shows that the function is continuous.
Figure 5.9: The magnetic induction ($B/\mu_0$) as a function of distance

The $B/\mu_0$ function is enlarged by a factor of $\mu_r$ where magnetic material is present. The figure is difficult to draw to scale because of the huge ($\mu_r = 10^3$) multiplier. The peak value of $B/\mu_0$ is $5\cdot 10^6\ \mathrm{A/m}$ at radius $r = R_1$. The corresponding $B$ value is 6.28 Tesla. The magnetization is the difference of the above functions:

$$\frac{\mathbf{M}}{\mu_0} = \frac{\mathbf{B}}{\mu_0} - \mathbf{H} \tag{5.28}$$

The peak value of $M/\mu_0$ is $4.995\cdot 10^6\ \mathrm{A/m}$ at radius $r = R_1$. The corresponding $M$ value is 6.277 Tesla.

$$\oint_g \frac{\mathbf{M}}{\mu_0}\cdot d\mathbf{r} = I_{magn} \tag{5.29}$$

The curve $g$ of the integration is the circle of radius $R_1$. The magnetizing current can be expressed as follows:

$$I_m = 2\pi R_1 \frac{M}{\mu_0} = 2\pi R_1\left(\frac{B(R_1)}{\mu_0} - H(R_1)\right) = 2\pi R_1\left(\frac{1}{\mu_0}\,\mu_0\mu_r\frac{j}{2}R_1 - \frac{j}{2}R_1\right) = R_1^2\pi j\,(\mu_r - 1) \tag{5.30}$$

The cross-sectional area is multiplied by the current density. This is the conductive current ($I_{Cond}$) in the rod. The magnetizing current is therefore:

$$I_m = I_{Cond}(\mu_r - 1) = 3140\cdot 999 = 3.14\cdot 10^6\ \mathrm{A} \tag{5.31}$$
The magnetizing current is a virtual current on the surface of the magnetic material. At radius $R_1$ the magnetizing current flows parallel to the conductive current, while at radius $R_2$ it flows in the opposite direction. At radii larger than $R_2$, the effects of the two opposite magnetizing currents cancel each other, so the magnetization drops to zero.
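The numbers of this example can be cross-checked with a short script. This is only a sketch under the assumptions of the example (copper and the outer space treated as non-magnetic, coating with constant $\mu_r$); the function name `rod_fields` and its parameters are illustrative and not part of the original text.

```python
import math

MU_0 = 4e-7 * math.pi  # vacuum permeability in T*m/A

def rod_fields(r, j=1e5, r1=0.10, r2=0.15, mu_r=1e3):
    """Return (H, B/mu_0, M/mu_0) in A/m at radius r (in metres)."""
    # Ampere's law: the enclosed current grows with r^2 inside the rod
    # and stays constant outside it.
    h = j * r / 2 if r <= r1 else j * r1**2 / (2 * r)
    # Only the coating (r1 <= r <= r2) multiplies B by mu_r; the copper rod
    # and the outer space are treated as non-magnetic (assumption).
    rel = mu_r if r1 <= r <= r2 else 1.0
    b_over_mu0 = rel * h
    return h, b_over_mu0, b_over_mu0 - h

h, b, m = rod_fields(0.10)            # coating side of the rod surface
i_cond = math.pi * 0.10**2 * 1e5      # conductive current in A
i_magn = 2 * math.pi * 0.10 * m       # magnetizing current in A, as in (5.30)
print(h, MU_0 * b, MU_0 * m, i_cond, i_magn)
```

The printed values should reproduce the peak values quoted in the text: about $5\cdot10^3$ A/m, 6.28 T, 6.277 T, and a magnetizing current 999 times the conductive current.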
5.6
The empty solenoid coil has already been treated in section 5.2. Now the cavity contains an iron core, which is characterized by its $\mu_r$ value. The physical model of the solenoid requires the length to be roughly ten times larger than the diameter. Let us use Ampère's law. The magnetic field is homogeneous in the solenoid. A rectangular closed path is chosen with one long side parallel to the coil axis inside the cavity. The opposite long side of the path is outside the coil, where there is no magnetic field. The remaining two short sides of the rectangle are normal to the magnetic field and can therefore be ignored. The length is denoted $l$ and the number of turns is $N$. So ultimately Ampère's law takes a simple form:

$$\oint_{loop} \mathbf{H}(\mathbf{r})\cdot d\mathbf{r} = Hl = NI$$

$$H = \frac{NI}{l}$$
Figure 5.11: Solenoid with core

The generated magnetic induction is as follows:

$$B = \mu_0\mu_r\frac{NI}{l} \tag{5.35}$$

The generated turn flux and coil flux values can be expressed:

$$\Phi_{turn} = BA = \mu_0\mu_r\frac{NIA}{l} \qquad \Phi_{coil} = BAN = \mu_0\mu_r\frac{N^2IA}{l} \tag{5.36}$$
The air gap is perpendicular to the magnetic field in the coil, therefore the magnetic induction is continuous on the surfaces of the air gap, since the normal component of the $\mathbf{B}$ field is always continuous on the surface of a magnetic material. That means the magnetic induction $B$ is constant all around the coil.

$$B_{iron} = B_{air} = B \tag{5.37}$$

Let us use Ampère's law:

$$H_{iron}\,l + H_{air}\,\delta = NI \qquad H_{air} = \frac{B}{\mu_0} \qquad H_{iron} = \frac{B}{\mu_0\mu_r} \tag{5.38}$$

The letters $l$ and $\delta$ are the circumference of the coil and the width of the air gap respectively. After substitution:

$$\frac{B}{\mu_0\mu_r}\,l + \frac{B}{\mu_0}\,\delta = NI \tag{5.39}$$

From here $B$ can readily be expressed:

$$B = \frac{\mu_0 NI}{\dfrac{l}{\mu_r} + \delta} \tag{5.40}$$

The generated turn flux and coil flux values can be expressed:

$$\Phi_{turn} = BA = \frac{\mu_0 NIA}{\dfrac{l}{\mu_r} + \delta} \qquad \Phi_{coil} = BAN = \frac{\mu_0 N^2IA}{\dfrac{l}{\mu_r} + \delta} \tag{5.41}$$
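To get a feeling for how strongly even a narrow air gap reduces the induction, formula (5.40) can be evaluated numerically. The following is a minimal sketch; the function name and the coil data (1000 turns, 1 A, 0.5 m circumference, $\mu_r = 1000$, 1 mm gap) are hypothetical values chosen only for illustration.

```python
import math

MU_0 = 4e-7 * math.pi  # vacuum permeability in T*m/A

def gapped_core_induction(n_turns, current, length, gap, mu_r):
    """B in tesla according to (5.40); gap = 0 reproduces the closed core (5.35)."""
    return MU_0 * n_turns * current / (length / mu_r + gap)

b_closed = gapped_core_induction(1000, 1.0, 0.5, 0.0, 1000.0)
b_gapped = gapped_core_induction(1000, 1.0, 0.5, 1e-3, 1000.0)
print(b_closed, b_gapped, b_closed / b_gapped)
```

With these assumed numbers the term $l/\mu_r$ is 0.5 mm, so a 1 mm gap triples the denominator and the induction drops to one third of the closed-core value.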
6.1
6.1.1
The plane generator is a hypothetical device which is impractical in its original form; however, it demonstrates the operation of several generators used in practice. The physical principles of operation are clearly apparent without the disturbing technical details.
Figure 6.1: Plane generator

The plane generator consists of two parallel conductive rails and a similarly conductive crossbar perpendicular to the rails. The crossbar can travel freely on the rails while staying in galvanic contact with both. A resistor of resistance $R$ is also connected between the rails. The whole setup is placed into a homogeneous magnetic field $\mathbf{B}$ which points into the plane of the paper. Let us move the crossbar with uniform velocity $\mathbf{v}$ parallel to the rails. Let the velocity vector point to the right. In the first part of the experiment the switch $S$, which connects the resistor to the setup, is open. The generated Lorentz electric field is as follows:

$$\mathbf{E}_L = \mathbf{v}\times\mathbf{B} \tag{6.1}$$
According to the vector product, the Lorentz electric field points upward. The Lorentz electric field pushes the positive charge carriers upward, so the upper terminal is the positive one. The result is the same if electrons are considered as charge carriers: the electrons are pushed downward, making the lower terminal negative, which matches the earlier result. The $\mathbf{v}$ and $\mathbf{B}$ vectors are normal to each other, so the absolute value of the result is merely the product of the absolute values:

$$E_L = vB \tag{6.2}$$

The absolute value of the induced voltage can be calculated without any integration, by a simple product.

$$U_{ind} = E_L\,l = Blv \tag{6.3}$$

Here the distance between the rails is denoted $l$. The absolute value of the induced voltage can also be calculated in the following way. The magnetic flux through the setup is the product of the magnetic induction ($B$) and the active area ($A$).

$$\Phi = BA = Blx \tag{6.4}$$

The time derivative of the above formula is as follows:

$$\frac{d\Phi}{dt} = B\frac{dA}{dt} = Bl\frac{dx}{dt} = Blv \tag{6.5}$$
The result matches the absolute value of the induced voltage. The polarity of the result should be considered separately. If the velocity vector points to the right, the active flux increases due to the increasing area. The increasing flux points into the paper, so by the right-hand screw rule the generated electric field would be expected to circulate clockwise. In contrast, the circulation of the electric field is counterclockwise, as was shown in the first part of this argument. Altogether the conclusion can be summarized in the following formula:

$$U_{ind} = -\frac{d\Phi}{dt} \tag{6.6}$$
This formula is the famous Faraday induction law. It holds for all kinds of induction processes. Now let us return to the discussion of the plane generator and close the switch $S$, thereby applying a load to the generator. The current flowing through the resistor is denoted $i$.

$$i = \frac{U_{ind}}{R} = \frac{Blv}{R} \tag{6.7}$$

The generated electric power $P_{el}$ is the following:

$$P_{el} = U_{ind}\,i = \frac{(Blv)^2}{R} \tag{6.8}$$
The induced current flows through the crossbar. The current and the magnetic field interact according to the Lorentz force law.

$$\mathbf{F}_L = i\,\mathbf{l}\times\mathbf{B} \tag{6.9}$$

The direction of the Lorentz force is just the opposite of the velocity. If I move the crossbar with my hand, I have to overcome the Lorentz force. This is the point to mention Lenz's law, which states the following: the direction of the induced current is such that, by means of its magnetic field, the induced current always opposes the original change in the magnetic flux. So if the flux is increased by my hand, the induced current opposes the motion of my hand. My force is parallel to the velocity, so I deliver positive power to the system.

$$F_L = ilB = \frac{Blv}{R}\,lB \tag{6.10}$$

The positive power delivered to the system is the product of the force and the velocity:

$$P_{mech} = F_L v = \frac{(Blv)^2}{R} \tag{6.11}$$
The electrical power matches the formula of the mechanical power. This means that the mechanical power required to move the crossbar against the force of the magnetic field is equal to the electrical power which heats up the resistor. Practically usable generator types based on the principle of the plane generator include the drum generator and the unipolar generator, both generating DC voltage.
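The power balance of the plane generator can be verified numerically by following the chain (6.3), (6.7), (6.8), (6.10), (6.11). The sketch below uses hypothetical values ($B = 0.5$ T, $l = 0.2$ m, $v = 3$ m/s, $R = 2\ \Omega$); the function name is illustrative.

```python
import math

def plane_generator(b, l, v, r):
    """Return (U_ind, i, P_el, P_mech) for the loaded plane generator."""
    u_ind = b * l * v          # induced voltage, (6.3)
    i = u_ind / r              # load current, (6.7)
    p_el = u_ind * i           # electrical power in the resistor, (6.8)
    force = i * l * b          # Lorentz force on the crossbar, (6.10)
    p_mech = force * v         # mechanical power invested, (6.11)
    return u_ind, i, p_el, p_mech

u_ind, i, p_el, p_mech = plane_generator(b=0.5, l=0.2, v=3.0, r=2.0)
print(u_ind, i, p_el, p_mech)
```

For any choice of the four parameters the last two returned values agree, which is the numerical counterpart of the statement that the mechanical work is fully converted into heat in the resistor.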
6.1.2
The rotating frame generator is a hypothetical device which is impractical in its original form; however, it demonstrates the operation of several generators used in practice. The physical principles of operation are clearly apparent without the disturbing technical details.
Figure 6.2: Rotating frame generator

The rotating frame generator consists of a wire frame with surface area $A$; there is no special condition on the shape of the frame. The frame rotates around an axis which is stretched between two diagonally opposite points of the frame. The axis is perpendicular to a homogeneous magnetic field. On the rotational axis there is a pair of slip rings which are solidly connected to the two ends of the cut wire frame. The sliding connectors are hooked up to a resistor of resistance $R$ through a switch $S$, which is open during the first part of the experiment. The flux in the frame as a function of time can be written easily:

$$\Phi(t) = BA\cos(\omega t) \tag{6.12}$$

Let us use the Faraday induction law:

$$U_{ind} = -\frac{d\Phi}{dt} = BA\omega\sin(\omega t) \tag{6.13}$$

The induced current is expressed by Ohm's law:

$$i = \frac{U_{ind}}{R} = \frac{BA\omega}{R}\sin(\omega t) \tag{6.14}$$
The electrical power generated is as follows:

$$P_{el} = U_{ind}\,i = \frac{(BA\omega)^2}{R}\sin^2(\omega t) \tag{6.15}$$

Let us now determine the mechanical power required to rotate the generator. A torque $\mathbf{M}$ acts on a magnetic dipole in a magnetic field $\mathbf{B}$, as already discussed in chapter 4.

$$\mathbf{M} = i\mathbf{A}\times\mathbf{B} \tag{6.16}$$

The absolute value of the torque is as follows:

$$M = iAB\sin(\omega t) \tag{6.17}$$

The mechanical power delivered to the system is the product of the torque and the angular velocity of the rotation.

$$P_{mech} = M\omega = \frac{BA\omega}{R}\sin(\omega t)\,AB\sin(\omega t)\,\omega = \frac{(BA\omega)^2}{R}\sin^2(\omega t) \tag{6.18}$$
The final formula of the mechanical power completely matches the formula of the electrical power consumption. So altogether the situation is clear. The torque of my hand which rotates the frame generator overcomes the opposition of the induced current according to Lenz's law. The generated power is consumed in the resistor by warming it up. Generators which supply AC voltage into the electrical energy systems operate on the principle of the rotating frame generator.
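The same balance can be checked for the rotating frame. The sketch below differentiates the flux (6.12) numerically, so that the Faraday law itself is exercised rather than the closed formula (6.13). All parameter values and the function name are hypothetical; the central difference with step `dt` is an approximation chosen for this illustration.

```python
import math

def frame_generator(b, area, omega, r, t, dt=1e-6):
    """Return (U_ind, P_el, P_mech) at time t for the loaded rotating frame."""
    def flux(tau):
        return b * area * math.cos(omega * tau)                 # (6.12)

    u_ind = -(flux(t + dt) - flux(t - dt)) / (2 * dt)            # numerical -dPhi/dt
    i = u_ind / r                                                # Ohm's law, (6.14)
    p_el = u_ind * i                                             # (6.15)
    torque = i * area * b * math.sin(omega * t)                  # (6.17)
    p_mech = torque * omega                                      # (6.18)
    return u_ind, p_el, p_mech

B, A, OMEGA, R, T0 = 0.2, 0.01, 100 * math.pi, 5.0, 0.003
u_ind, p_el, p_mech = frame_generator(B, A, OMEGA, R, T0)
print(u_ind, B * A * OMEGA * math.sin(OMEGA * T0), p_el, p_mech)
```

The numerically differentiated voltage should agree with $BA\omega\sin(\omega t)$ up to the small discretization error, and the electrical and mechanical powers should coincide at every instant.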
6.1.3
Eddy currents
If the magnetic field changes over time inside a conductive medium, the generated electric field gives rise to loop currents which circulate in the medium. These are the eddy currents, which cause energy dissipation in the medium. The direction of the current is determined by Lenz's law.

Swinging rings experiment

The rings are made of aluminum, with an approximate diameter of twenty centimeters. One of them has a thin cut, so this ring is not continuous all around. The rings are suspended according to the figure. A bar magnet is pushed back and forth into the ring. The ring with the cut is unaffected by the periodic motion of the magnet. However, the intact ring gradually starts to swing if the operator moves the magnet in synchronism with the oscillation frequency of the pendulum.
Figure 6.3: Swinging rings experiment

Explanation: The motion of the bar magnet causes the variation of the flux in the ring. The induced electric field generates the induced loop current (eddy current). The magnetic field of the induced loop current opposes the original effect according to Lenz's law. The original effect is the motion of the bar magnet, which cannot be stopped, therefore the ring starts to swing under the periodic effect of the braking force. If the ring with the cut is used, no effect shows up, since the cut inhibits the formation of the eddy current.

Waltenhofen pendulum

A pendulum is made with an aluminum workpiece on its end according to the figure. One of the workpieces is intact, the other one has comb-like cuts. The intact workpiece swings between the jaws of the electromagnet, which is inactive. Let the pendulum swing and observe that the attenuation of the motion is insignificant. Now switch DC voltage to the electromagnet. The swinging stops completely within three oscillations. Now replace the workpiece with the one with the cuts. When the experiment is repeated, the attenuation does not appear on switching on the magnet.

Explanation: When the intact piece was moving through the magnetic field, an eddy current was generated in the workpiece by the flux variation. The magnetic effect of the eddy current attenuated the swing according to Lenz's law. Once the workpiece with cuts had been installed, the attenuation failed to appear, since the eddy currents were prevented from forming.
6.2
Electromagnetic induction can also occur without mechanical motion. In this case the primary cause of the induction process is the variation of the electric current, which in turn generates a time-varying magnetic field.
6.2.1
Consider $n$ current loops. Each of them carries a current $i_i$ and each of them encloses a flux $\Phi_i$. The flux originates partly from the loop's own current and partly from all the other current loops.

Figure 6.5: Current loops

Experience shows that the coupled fluxes and the self fluxes are proportional to the corresponding currents, so the total flux in a loop can be expressed by a linear relation.

$$L_1 i_1 + M_{12} i_2 + M_{13} i_3 = \Phi_1 \tag{6.19}$$

$$M_{21} i_1 + L_2 i_2 + M_{23} i_3 = \Phi_2 \tag{6.20}$$

$$M_{31} i_1 + M_{32} i_2 + L_3 i_3 = \Phi_3 \tag{6.21}$$

This relation is best presented using matrix formalism.

$$\begin{pmatrix} L_1 & M_{12} & M_{13} \\ M_{21} & L_2 & M_{23} \\ M_{31} & M_{32} & L_3 \end{pmatrix}\begin{pmatrix} i_1 \\ i_2 \\ i_3 \end{pmatrix} = \begin{pmatrix} \Phi_1 \\ \Phi_2 \\ \Phi_3 \end{pmatrix} \tag{6.22}$$
Here the self induction coefficients and the mutual induction coefficients are denoted by the letters $L$ and $M$ respectively. In the $M$ coefficients the first subscript indicates the current loop that receives the external flux, while the second subscript shows the current loop that generates the magnetic field. Obviously the self induction coefficients do not need double subscripts. The major physical point of the above description is that the induction matrix is symmetrical. That means, for instance, that the $M_{12}$ and $M_{21}$ elements are equal. If current is applied to loop 1 and the flux is measured in loop 2, the mutual induction coefficient turns out to be the same as when the current loop and the measuring loop are swapped. This symmetry makes it possible to determine the mutual induction coefficient in whichever way is most convenient.
6.2.2
Consider a current loop according to the figure above. Let us assume an electric current flowing downward and increasing in magnitude. Therefore the time derivative of the current also points downward. In the loop, both the generated flux and the time derivative of the flux point into the sheet of the paper. The induced electric field (some earlier books call it electromotive force) in the loop circulates counterclockwise, due to the negative sign in the Faraday induction law.

$$U_{ind} = -\frac{d\Phi}{dt} \tag{6.23}$$

Accordingly, the induced electric field pushes the positive charge carriers counterclockwise. This means that the positive pole of the induced voltage is the upper pole and the negative pole is the lower one. The voltage drop of a two-pole component is directed from the positive to the negative pole. So the measurable loop voltage points downward, in the same direction as the time derivative of the current. Thus the measured loop voltage on the two-pole component is the time derivative of the loop current multiplied by the self induction coefficient, without the negative sign.

$$U_{loop} = L\frac{di}{dt} \tag{6.24}$$

If there are more loops in the setup, the loop voltages can be expressed as the time derivative of the above matrix equation:

$$\begin{pmatrix} L_1 & M_{12} & M_{13} \\ M_{21} & L_2 & M_{23} \\ M_{31} & M_{32} & L_3 \end{pmatrix}\begin{pmatrix} di_1/dt \\ di_2/dt \\ di_3/dt \end{pmatrix} = \begin{pmatrix} U_1 \\ U_2 \\ U_3 \end{pmatrix} \tag{6.25}$$
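The two matrix relations, (6.22) for the fluxes and (6.25) for the loop voltages, are plain matrix-vector products. A minimal sketch in pure Python follows; the inductance values, currents and current derivatives are hypothetical numbers chosen only to show the bookkeeping, and the helper `mat_vec` is not part of the original text.

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

# Hypothetical induction matrix in henry: self induction coefficients on the
# diagonal, mutual induction coefficients off the diagonal (M12 = M21 etc.).
L_MATRIX = [
    [2.0e-3, 0.5e-3, 0.1e-3],
    [0.5e-3, 3.0e-3, 0.2e-3],
    [0.1e-3, 0.2e-3, 1.0e-3],
]

currents = [1.0, 2.0, 0.0]      # i1, i2, i3 in ampere
di_dt = [10.0, 0.0, -5.0]       # time derivatives in A/s

fluxes = mat_vec(L_MATRIX, currents)    # (6.22), in weber
voltages = mat_vec(L_MATRIX, di_dt)     # (6.25), in volt
is_symmetric = all(
    L_MATRIX[a][b] == L_MATRIX[b][a] for a in range(3) for b in range(3)
)
print(fluxes, voltages, is_symmetric)
```

Note that loop 3 carries no current in this example, yet it still encloses a flux and shows a loop voltage, both produced entirely by the mutual coefficients.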
6.2.3
The transformer
The transformer consists of at least two coils on a common iron core. The coil which receives the excitation is called the primary coil, while the coil which provides the output is called the secondary coil. Two geometrical arrangements will be discussed, the solenoid and the toroid. These two geometries have already been discussed to some extent in chapter 5. Both types will be discussed in a uniform way with the following notation: the cross-sectional area of the coil is $A$; the length of the solenoid and the circumference of the toroid are both denoted $l$; the numbers of turns are denoted $N_1$ and $N_2$. In both cases Ampère's law takes a simple form:

$$\oint_{loop} \mathbf{H}(\mathbf{r})\cdot d\mathbf{r} = N_1 I \tag{6.26}$$

$$Hl = N_1 I \tag{6.27}$$

$$H = \frac{N_1 I}{l} \tag{6.28}$$
The generated magnetic induction is as follows:

$$B = \mu_0\mu_r\frac{N_1 I}{l} \tag{6.29}$$

The generated turn flux can be expressed:

$$\Phi_{turn} = BA = \mu_0\mu_r\frac{N_1 IA}{l} \tag{6.30}$$
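The voltage of one turn of the coil is called the turn voltage. This is the time derivative of the turn flux.

$$U_{turn} = \frac{dB}{dt}A = \frac{\mu_0\mu_r N_1 A}{l}\frac{dI}{dt} \tag{6.31}$$

The primary voltage is the turn voltage multiplied by the number of primary turns:

$$U_1 = N_1 U_{turn} = \frac{\mu_0\mu_r N_1^2 A}{l}\frac{dI}{dt} = L_1\frac{dI}{dt} \tag{6.32}$$

From this the self induction coefficient of the primary coil can be read off:

$$L_1 = \frac{\mu_0\mu_r N_1^2 A}{l} \tag{6.33}$$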
Similarly, for the secondary coil the self induction coefficient is as follows:

$$L_2 = \frac{\mu_0\mu_r N_2^2 A}{l} \tag{6.34}$$

The coil flux can also be expressed:

$$\Phi_{1coil} = N_1\Phi_{turn} = L_1 I \tag{6.35}$$

The secondary voltage can also be determined by means of the turn voltage:

$$U_2 = N_2 U_{turn} = \frac{\mu_0\mu_r N_1 N_2 A}{l}\frac{dI}{dt} = M\frac{dI}{dt} \tag{6.36}$$

An important relation can be seen easily:

$$M^2 = L_1 L_2 \tag{6.38}$$

Let us compare the primary and the secondary voltages.

$$\frac{U_2}{U_1} = \frac{N_2 U_{turn}}{N_1 U_{turn}} = \frac{N_2}{N_1} = T \tag{6.39}$$
The ratio of the numbers of turns is called the transmission ($T$) of the transformer. So far there has been no load on the secondary coil. Let us analyze the case when a load is applied to the secondary coil. The analysis is carried out by means of the complex amplitudes of the sinusoidal quantities. Here $j$ denotes the imaginary unit.
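The Kirchhoff loop equations are used.

$$j\omega L_1 I_1 - j\omega M I_2 - U = 0 \tag{6.40}$$

$$R I_2 + j\omega L_2 I_2 - j\omega M I_1 = 0 \tag{6.41}$$

Regroup the equations.

$$j\omega L_1 I_1 - j\omega M I_2 = U \tag{6.42}$$

$$-j\omega M I_1 + j\omega L_2 I_2 = -R I_2 \tag{6.43}$$

The first and the second equations are multiplied by $M$ and $L_1$ respectively.

$$j\omega L_1 M I_1 - j\omega M^2 I_2 = UM \tag{6.44}$$

$$-j\omega L_1 M I_1 + j\omega L_1 L_2 I_2 = -R L_1 I_2 \tag{6.45}$$

Add the equations together. The first terms cancel out.

$$j\omega I_2\,(L_1 L_2 - M^2) = UM - R L_1 I_2 \tag{6.46}$$

$$I_2\left[\,j\omega (L_1 L_2 - M^2) + R L_1\right] = UM \tag{6.47}$$

From here $I_2$ can be expressed:

$$I_2 = \frac{U}{j\omega\left(\dfrac{L_1 L_2}{M} - M\right) + \dfrac{R L_1}{M}} = \frac{U}{j\omega (M - M) + R\,\dfrac{1}{T}} = \frac{UT}{R} \tag{6.48}$$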
Here the relations $L_1 L_2 = M^2$ and $L_1/M = 1/T$ have been used. The generated secondary voltage is simply:

$$U_2 = R I_2 = UT \tag{6.49}$$

This equation shows that the secondary voltage matches the value of the case without load. This result is a consequence of the model we are using, in which the coils are free of series resistance. The effective output power is half the product of the voltage and current amplitudes.

$$P_{out} = \frac{1}{2} U_2 I_2 = \frac{(UT)^2}{2R} \tag{6.50}$$
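Now let us find the relations on the primary side. For this purpose the value of $I_1$ should be determined from the initial equations.

$$j\omega L_1 I_1 - j\omega M I_2 = U \tag{6.51}$$

The value of $I_2$ is substituted:

$$j\omega L_1 I_1 - j\omega M\frac{UT}{R} = U \tag{6.52}$$

$I_1$ can be expressed readily.

$$I_1 = \frac{U\left(1 + j\omega M\dfrac{T}{R}\right)}{j\omega L_1} = \frac{U}{R}\,\frac{R + j\omega MT}{j\omega L_1} = \frac{U}{R}\left(\frac{R}{j\omega L_1} + \frac{MT}{L_1}\right) = \frac{U}{R}\left(\frac{R}{j\omega L_1} + T^2\right) \tag{6.53}$$

$$I_1 = \frac{U}{j\omega L_1} + \frac{U}{R}T^2 \tag{6.54}$$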
Two conclusions can be drawn. The first one is related to the input power. The input current above consists of two terms. The first one is a reactive current which lags the voltage by ninety degrees. This term does not produce effective power. The second term is in phase with the voltage, so it determines the effective power.

$$P_{in} = \frac{1}{2} U I_1 = \frac{(UT)^2}{2R} \tag{6.55}$$
The result perfectly matches the form of the output power. The second conclusion refers to the input impedance of the transformer. Let us calculate the value of $Z_{in}$, which is the ratio of the input voltage to the primary current.

$$Z_{in} = \frac{U}{I_1} = \frac{U}{\dfrac{U}{j\omega L_1} + \dfrac{U}{R}T^2} = \frac{1}{\dfrac{1}{j\omega L_1} + \dfrac{1}{R}T^2} = \frac{R}{\dfrac{R}{j\omega L_1} + T^2} \tag{6.56}$$
Provided the frequency is high enough, the first term in the denominator can be ignored relative to the square of the transmission. In this case the input impedance is a real resistance.

$$R_{in} = R\left(\frac{1}{T}\right)^2 \tag{6.57}$$

The second conclusion is therefore that the load resistance on a transformer shows up at the input of the transformer as a real resistance with the value above. As a rule of thumb, the load is transformed to the input with the inverse square of the transmission.
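The algebra of the loaded transformer can be double-checked by solving the two Kirchhoff equations numerically with complex amplitudes. The following is a sketch under the ideal-coupling model of this section ($M^2 = L_1L_2$, no series resistance). The factor `k` stands for $\mu_0\mu_r A/l$, and all numerical values (10 V, 50 Hz, 100 and 200 turns, 50 Ω load) are hypothetical.

```python
import math

def loaded_transformer(u, omega, n1, n2, k, r):
    """Solve (6.42)-(6.43) for the complex amplitudes I1 and I2.

    k stands for mu_0*mu_r*A/l, so L1 = k*n1^2, L2 = k*n2^2, M = k*n1*n2.
    """
    l1, l2, m = k * n1**2, k * n2**2, k * n1 * n2
    # Coefficient matrix of the two loop equations, right-hand side (u, 0).
    a11, a12 = 1j * omega * l1, -1j * omega * m
    a21, a22 = -1j * omega * m, r + 1j * omega * l2
    det = a11 * a22 - a12 * a21
    i1 = u * a22 / det          # Cramer's rule
    i2 = -u * a21 / det
    return i1, i2

U, OMEGA, N1, N2, K, R = 10.0, 2 * math.pi * 50, 100, 200, 1e-5, 50.0
i1, i2 = loaded_transformer(U, OMEGA, N1, N2, K, R)
T = N2 / N1
z_in = U / i1
p_in = 0.5 * (U * i1.conjugate()).real     # effective input power
p_out = 0.5 * R * abs(i2) ** 2             # power dissipated in the load
print(i2, U * T / R, z_in, p_in, p_out)
```

With these assumed values the secondary current equals $UT/R$, the input and output powers coincide, and `z_in` agrees with (6.56). Raising the frequency moves `z_in` toward the purely real value $R/T^2$ of (6.57).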
6.2.4
Let us increase the current in a solenoid coil gradually from zero to some value $I$. The coil reacts by opposing the increasing current according to Lenz's law. The induced voltage of the coil needs to be overcome in order to force the current through. This way we have to carry out positive work, and finally the coil possesses magnetic energy. The induced voltage is expressed by the known formula:

$$U = L\frac{dI}{dt} \tag{6.58}$$

Multiplying it by the current, one obtains the invested power.

$$P(t) = UI = \left(L\frac{dI}{dt}\right)I = L\left(I\frac{dI}{dt}\right) \tag{6.59}$$

On the right-hand side a function and its time derivative are multiplied together. It is known from mathematics that the following rule holds.

$$f(x)\frac{df(x)}{dx} = \frac{1}{2}\frac{d}{dx}f^2(x) \tag{6.60}$$

Let us apply this rule to the original case.

$$I\frac{dI}{dt} = \frac{1}{2}\frac{d}{dt}(I^2) \tag{6.61}$$

This can be readily substituted.

$$P(t) = \frac{1}{2}L\frac{d}{dt}\left(I^2(t)\right) \tag{6.62}$$
Let us integrate the two sides of the equation over time with homogeneous initial conditions. Accordingly, in the initial state there was no current and no energy stored in the coil.

$$\int_0^t P(t')\,dt' = \frac{L}{2}\int_0^t \frac{d}{dt'}I^2(t')\,dt' = \frac{L}{2}\int_0^t d\,I^2(t') = \frac{L}{2}I^2(t) \tag{6.63}$$

$$\int_0^t P(t')\,dt' = E_m(t)$$

The last two equations combined provide the final result for the magnetic energy of the coil:

$$E_m = \frac{1}{2}LI^2 \tag{6.64}$$
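The magnetic energy density ($\varepsilon_m$) in the coil is the ratio of the energy to the volume ($Al$):

$$\varepsilon_m = \frac{E_m}{Al} = \frac{1}{2}\frac{\mu_0\mu_r N^2}{l^2}I^2 = \frac{1}{2}\mu_0\mu_r\left(\frac{NI}{l}\right)^2 \tag{6.66}$$

The following two formulas are well-known:

$$\frac{NI}{l} = H \qquad \mu_0\mu_r H = B \tag{6.67}$$

By means of these, the magnetic energy density in the coil comes out.

$$\varepsilon_m = \frac{1}{2}HB \quad \left[\frac{\mathrm{J}}{\mathrm{m^3}}\right] \tag{6.68}$$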
Though this result was deduced for the specific case of a solenoid coil, the formula for the energy density is valid for any geometry. In the general case, half the dot product of the magnetic field and the magnetic induction gives the energy density.
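The consistency of the two energy expressions can be checked on a concrete coil. The sketch below assumes the solenoid self induction coefficient $L = \mu_0\mu_r N^2 A/l$ that follows from the coil flux formula of chapter 5; the coil data and the function name are hypothetical.

```python
import math

MU_0 = 4e-7 * math.pi  # vacuum permeability in T*m/A

def solenoid_energy(n_turns, current, length, area, mu_r=1.0):
    """Return the magnetic energy computed two ways, in joule."""
    inductance = MU_0 * mu_r * n_turns**2 * area / length
    e_from_l = 0.5 * inductance * current**2          # (6.64)
    h = n_turns * current / length                    # magnetic field, A/m
    b = MU_0 * mu_r * h                               # magnetic induction, T
    e_from_density = 0.5 * h * b * area * length      # density (6.68) times volume
    return e_from_l, e_from_density

e1, e2 = solenoid_energy(n_turns=500, current=2.0, length=0.3, area=1e-3, mu_r=200.0)
print(e1, e2)
```

Both numbers agree, as they must, since the energy density was derived from $E_m = \tfrac{1}{2}LI^2$ by dividing by the volume $Al$.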
6.3
In the table below the Maxwell equations are summarized along with some auxiliary relations between the field parameters.

Maxwell 1. Ampère's law
This expresses the fact that the magnetic force lines do not have starting and final points; rather, they close on themselves as closed loops. The rotation of the magnetic force lines is determined by the sum of the conductive current and the displacement current. The displacement current generates a magnetic field without any electric conduction. This makes electromagnetic wave propagation in space possible.

Maxwell 2. Faraday induction law
This law states that a changing magnetic field generates a circulating electric field around itself. The direction of the circulation is opposite to the right-hand screw direction. In the absence of a time-varying magnetic field the electric field is circulation free, thus a scalar potential can be introduced.

Maxwell 3. Magnetic Gauss law
This equation declares that magnetic monopoles do not exist, so the magnetic flux through any closed surface is zero.

Maxwell 4. Gauss law
This law declares that the electric force lines start on positive charges and end on negative charges. The electric flux through a closed surface is proportional to the amount of free charge enclosed, and it is zero if the charges are outside the closed volume.

The Stokes and Gauss-Ostrogradsky theorems provide the conversion between the differential and integral laws.

| Equation | Differential form | Integral form | Auxiliary relations |
|---|---|---|---|
| Maxwell 1. Ampère's law | $\mathrm{rot}\,\mathbf{H} = \mathbf{j} + \dfrac{\partial \mathbf{D}}{\partial t}$ | $\oint_g \mathbf{H}\,d\mathbf{r} = I_{cond} + \dfrac{d\Phi_D}{dt}$ | $\Phi_D = \int_S \mathbf{D}\,d\mathbf{A}$, $\ \varepsilon_0\mu_0 = c^{-2}$, $\ \mathbf{B} = \mu_0\mu_r\mathbf{H}$ |
| Maxwell 2. Faraday induction law | $\mathrm{rot}\,\mathbf{E} = -\dfrac{\partial \mathbf{B}}{\partial t}$ | $\oint_g \mathbf{E}\,d\mathbf{r} = -\dfrac{d\Phi_B}{dt}$ | $\Phi_B = \int_S \mathbf{B}\,d\mathbf{A}$, $\ \varepsilon_0\mu_0 = c^{-2}$, $\ \mathbf{D} = \varepsilon_0\varepsilon_r\mathbf{E}$ |
| Maxwell 3. Magnetic Gauss law | $\mathrm{div}\,\mathbf{B} = 0$ | $\oint_S \mathbf{B}\,d\mathbf{A} = 0$ | |
| Maxwell 4. Gauss law | $\mathrm{div}\,\mathbf{D} = \rho_{free}$ | $\oint_S \mathbf{D}\,d\mathbf{A} = Q_{free}$ | |
| Stokes theorem | Conversion | $\oint_g \mathbf{v}\,d\mathbf{r} = \int_S \mathrm{rot}\,\mathbf{v}\,d\mathbf{A}$ | |
| Gauss-Ostrogradsky theorem | Conversion | $\oint_S \mathbf{v}\,d\mathbf{A} = \int_V \mathrm{div}\,\mathbf{v}\,dV$ | |
In the previous semester we have already discussed mechanical oscillators. Although the physical processes in an electrical oscillator are very different from those in a mechanical oscillator, the equations describing them take similar forms, and result in similar behaviour. The simplest electrical oscillator consists of an ideal inductor and capacitor. The voltage of the inductor is:

$$U_L = L\frac{dI}{dt} \tag{7.1}$$

where $I$ is the current running through the inductor, and $L$ is its self-inductance. The voltage of the capacitor is proportional to the charge accumulated in it:

$$U_C = \frac{Q}{C} = \frac{1}{C}\int I\,dt \tag{7.2}$$

where $Q$ is the accumulated charge in the capacitor, $C$ is its capacitance and $I$ is the current running into it. A simple circuit consisting of only these two elements can be described using Kirchhoff's second rule:

$$\sum_i U_i = 0 \tag{7.3}$$

$$L\frac{dI}{dt} + \frac{Q}{C} = 0 \tag{7.4}$$

$$L\frac{d^2I}{dt^2} + \frac{1}{C}\frac{dQ}{dt} = 0 \tag{7.5}$$

$$L\frac{d^2I}{dt^2} + \frac{1}{C}I = 0 \tag{7.6}$$
Equation (7.6) has the same form as the equation describing the undamped mechanical oscillator ($m\frac{d^2x}{dt^2} + Dx = 0$), with $x$ replaced by $I$, $m$ replaced by $L$ and $D$ replaced by $\frac{1}{C}$. As the two equations have the same form, so do their solutions:

$$I(t) = I_0\cos(\omega_0 t + \varphi_0) \tag{7.7}$$

where $\omega_0 = \frac{1}{\sqrt{LC}}$, and $I_0$ and $\varphi_0$ are determined by the initial conditions.

In the case of a mechanical oscillator, the viscosity of the medium in which the oscillator is moving causes damping. In the case of an electrical oscillator, the ohmic resistance of the electrical components has a similar effect. A circuit consisting of an inductor, a resistor and a capacitor connected in a loop behaves as a damped oscillator:

$$\sum_i U_i = 0$$

$$L\frac{dI}{dt} + RI + \frac{Q}{C} = 0$$

Differentiating with respect to time, dividing by $L$ and introducing $\beta = \frac{R}{2L}$ gives:

$$\frac{d^2I}{dt^2} + 2\beta\frac{dI}{dt} + \omega_0^2 I = 0$$

The equation has three different solutions depending on $\omega_0$ and $\beta$:
In the underdamped case ($\omega_0 > \beta$)
(7.12)
(7.13)
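In the critically damped case ($\omega_0 = \beta$)

$$I_c(t) = (A + Bt)\,e^{-\beta t} \tag{7.14}$$

In the overdamped case ($\omega_0 < \beta$)

$$I_o(t) = I_1 e^{\lambda_1 t} + I_2 e^{\lambda_2 t} \tag{7.15}$$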
Damped electrical oscillators lose energy due to losses on the ohmic resistance of the electrical components. Even in the underdamped case the amplitude of the oscillation decreases exponentially, and the frequency is lower than that of the undamped oscillator. If the ohmic resistance (the damping) is increased, the period of the oscillation also increases. When $\frac{R}{2L} = \frac{1}{\sqrt{LC}}$, the period reaches infinity: even a single cycle of the oscillation would take an infinitely long time. For higher ohmic resistances, no oscillations are possible.

Since all practical electronic components have some ohmic resistance, all practical electrical oscillators behave as damped oscillators. Even if the damping is weak, the oscillation continuously decays due to ohmic losses. Similarly to mechanical oscillators, they require external power to function for extended periods of time. This can be achieved by connecting a voltage source into the circuit:

$$L\frac{dI}{dt} + RI + \frac{Q}{C} = U_0\sin(\omega_f t) \tag{7.16}$$

$$L\frac{d^2Q}{dt^2} + R\frac{dQ}{dt} + \frac{1}{C}Q = U_0\sin(\omega_f t) \tag{7.17}$$

$$\frac{d^2Q}{dt^2} + \frac{R}{L}\frac{dQ}{dt} + \frac{1}{LC}Q = \frac{U_0}{L}\sin(\omega_f t) \tag{7.18}$$

This equation is identical to the equation describing the forced mechanical oscillator, and its solution also takes the same form. The solution is the sum of the general solution of the homogeneous equation and a particular solution of the inhomogeneous equation. Similarly to mechanical oscillators, the homogeneous solution decays exponentially, and after a short transient only the inhomogeneous solution remains:

$$Q(t) = Q_0\cos(\omega_f t + \varphi) \tag{7.19}$$

where

$$Q_0 = \frac{U_0/L}{\sqrt{(\omega_0^2 - \omega_f^2)^2 + 4\beta^2\omega_f^2}} \tag{7.20}$$

$$\tan(\varphi) = \frac{\omega_f R/L}{\omega_0^2 - \omega_f^2} \tag{7.21}$$
The charge oscillating in the circuit is maximal when the angular frequency of the external driving is

$$\omega_Q = \sqrt{\omega_0^2 - 2\beta^2} \tag{7.22}$$
As the voltage of the capacitor is proportional to the charge stored in it, this is called voltage or charge resonance. The current is:

$$I = \frac{dQ}{dt} = -Q_0\,\omega_f\sin(\omega_f t + \varphi) \tag{7.23}$$
The amplitude of the current becomes maximal, when the frequency of the external driving force is equal to the natural frequency of the circuit. This is called current resonance.
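The two resonance conditions can be located numerically by scanning the driving frequency. The sketch below evaluates the charge amplitude (7.20) with $\beta = R/(2L)$ on a frequency grid and picks the maxima of the charge amplitude and of the current amplitude $Q_0\omega_f$. The component values (1 mH, 10 Ω, 1 µF) and the grid are hypothetical choices for illustration.

```python
import math

def charge_amplitude(omega_f, u0, ind, res, cap):
    """Steady-state charge amplitude Q0 of the driven series RLC circuit, (7.20)."""
    omega0_sq = 1.0 / (ind * cap)
    beta = res / (2.0 * ind)
    return (u0 / ind) / math.sqrt(
        (omega0_sq - omega_f**2) ** 2 + 4 * beta**2 * omega_f**2
    )

U0, IND, RES, CAP = 1.0, 1e-3, 10.0, 1e-6   # hypothetical component values
grid = [20000.0 + step for step in range(20001)]   # rad/s, 1 rad/s spacing

omega_charge = max(grid, key=lambda w: charge_amplitude(w, U0, IND, RES, CAP))
omega_current = max(grid, key=lambda w: w * charge_amplitude(w, U0, IND, RES, CAP))

omega0 = 1.0 / math.sqrt(IND * CAP)
beta = RES / (2.0 * IND)
print(omega_charge, math.sqrt(omega0**2 - 2 * beta**2))   # charge resonance, (7.22)
print(omega_current, omega0)                              # current resonance
```

Within the resolution of the grid, the charge amplitude peaks at $\sqrt{\omega_0^2 - 2\beta^2}$, slightly below the natural frequency, while the current amplitude peaks at $\omega_0$ itself.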
7.2
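As we have seen in the previous semester, disturbances in an elastic medium may generate mechanical waves. In a similar manner, disturbances of the electric and magnetic fields may generate electromagnetic waves. But unlike mechanical disturbances, electric and magnetic fields can exist in perfect vacuum, therefore electromagnetic waves do not require the presence of any medium. Applying the Maxwell equations to perfect vacuum shows that changes in the electric field can generate a changing magnetic field, which in turn generates a changing electric field. From Faraday's law of induction:

$$\mathrm{rot}\,\mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{7.24}$$

$$\mathrm{rot}(\mathrm{rot}\,\mathbf{E}) = -\mathrm{rot}\,\frac{\partial \mathbf{B}}{\partial t} \tag{7.25}$$

$$\mathrm{rot}(\mathrm{rot}\,\mathbf{E}) = -\mathrm{rot}\,\frac{\partial (\mu_0\mathbf{H})}{\partial t} \tag{7.26}$$

$$\mathrm{rot}(\mathrm{rot}\,\mathbf{E}) = -\mu_0\frac{\partial}{\partial t}\,\mathrm{rot}\,\mathbf{H} \tag{7.27}$$

In perfect vacuum there are no free charges, and consequently $\mathbf{j} = 0$. Thus the first term on the right-hand side of Ampère's law disappears:

$$\mathrm{rot}\,\mathbf{H} = \mathbf{j} + \frac{\partial \mathbf{D}}{\partial t} = \frac{\partial \mathbf{D}}{\partial t} \tag{7.28}$$

Substituting this into (7.27) gives:

$$\mathrm{rot}(\mathrm{rot}\,\mathbf{E}) = -\mu_0\frac{\partial^2 \mathbf{D}}{\partial t^2} \tag{7.29}$$

$$\mathrm{rot}(\mathrm{rot}\,\mathbf{E}) = -\mu_0\varepsilon_0\frac{\partial^2 \mathbf{E}}{\partial t^2} \tag{7.30}$$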
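Using the mathematical identity $\mathrm{rot}(\mathrm{rot}\,\mathbf{E}) = \mathrm{grad}(\mathrm{div}\,\mathbf{E}) - \Delta\mathbf{E}$:

$$\mathrm{grad}(\mathrm{div}\,\mathbf{E}) - \Delta\mathbf{E} = -\mu_0\varepsilon_0\frac{\partial^2 \mathbf{E}}{\partial t^2} \tag{7.31}$$

In perfect vacuum the charge density is 0, therefore:

$$\mathrm{div}\,\mathbf{D} = \rho = 0 \tag{7.32}$$

$$\varepsilon_0\,\mathrm{div}\,\mathbf{E} = 0 \tag{7.33}$$

Substituting this into (7.31), the first term on the left-hand side disappears:

$$\Delta\mathbf{E} = \mu_0\varepsilon_0\frac{\partial^2 \mathbf{E}}{\partial t^2} \tag{7.34}$$

This is a wave equation: the second derivative of the electric field with respect to the space coordinates is proportional to its second derivative with respect to time. The velocity of the wave may be determined by substituting a wave function into (7.34). Equation (7.35) represents a plane wave:

$$\mathbf{E} = \mathbf{E}_0\,e^{j(\omega t - \mathbf{k}\mathbf{r})} \tag{7.35}$$

Substituting this into (7.34) gives:

$$-\mathbf{k}\mathbf{k}\,\mathbf{E}_0\,e^{j(\omega t - \mathbf{k}\mathbf{r})} = -\omega^2\mu_0\varepsilon_0\,\mathbf{E}_0\,e^{j(\omega t - \mathbf{k}\mathbf{r})} \tag{7.36}$$

$$k^2 = \omega^2\mu_0\varepsilon_0 \tag{7.37}$$

$$c = \frac{\omega}{k} = \frac{1}{\sqrt{\mu_0\varepsilon_0}} \tag{7.38}$$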
Surprisingly, this is exactly the same as the speed of light in vacuum. As scientists already knew that light exhibits wavelike behaviour, James Clerk Maxwell suggested in an 1865 paper that light may be a form of electromagnetic wave. Since then the hypothesis has been confirmed by many experiments. Today we know that visible light, along with radio waves, microwaves, X-rays and gamma radiation, is indeed a form of electromagnetic wave.

The lowest frequency (longest wavelength) electromagnetic waves are usually referred to as radio waves. Their wavelengths are typically several meters and their frequencies range up to a few hundred megahertz. Microwaves have frequencies in the gigahertz range and their wavelengths are usually a few centimetres or millimetres. Infrared radiation has a typical wavelength of a few microns, and its frequency is in the terahertz range. Visible light is a very narrow band in the electromagnetic spectrum: our eyes are capable of detecting electromagnetic waves whose wavelength is between 400 nm (violet) and 750 nm (red). If the wavelength is shorter than 400 nm, the radiation becomes invisible to human eyes, and it is usually referred to as ultraviolet radiation. X-rays have a typical wavelength of a few nanometres or only a few angstroms. Electromagnetic waves with wavelengths shorter than $10^{-2}$ nm are called gamma-rays.

Radio waves and microwaves are usually generated by accelerating charged particles (like the electrons in a radio antenna, or in the magnetron of a microwave oven). Microwaves and infrared radiation may also be created by molecular rotations. Most of the thermal radiation of room temperature bodies also appears in the infrared regime. Near-infrared radiation (the section of the infrared spectrum that is closest to visible light) and visible light may be the result of molecular vibrations. Visible light and ultraviolet radiation may also be created by electronic transitions in molecules and atoms. X-rays are usually generated by high energy electronic transitions in atoms, or by accelerating (or decelerating) high energy charged particles (bremsstrahlung). The highest frequency electromagnetic waves, the so-called gamma-rays, are related to nuclear processes.
7.3
In the previous section we have assumed that both $\mathbf{j}$ and $\rho$ are zero. This is certainly true for perfect vacuum, but in the presence of a medium, electric and magnetic fields may induce currents and polarise the medium, thus the deduction above loses its validity. However, in non-conductive media (such as glass, or different types of polymers) the assumption that the current density is zero still holds, and polarisation can be taken into account through the dielectric constant and the permeability. In isotropic media:
$$ \mathbf{D} = \varepsilon_0 \varepsilon_r \mathbf{E} = \varepsilon \mathbf{E} \tag{7.39} $$
$$ \mathbf{B} = \mu_0 \mu_r \mathbf{H} = \mu \mathbf{H} \tag{7.40} $$
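Using these formulas, the deduction is very similar to what we have seen in the previous section. The only difference is that $\varepsilon_0$ and $\mu_0$ are replaced by $\varepsilon$ and $\mu$. The wave equation becomes:
$$ \Delta \mathbf{E} = \varepsilon \mu \frac{\partial^2 \mathbf{E}}{\partial t^2} \tag{7.41} $$
And the speed of light in the medium is:
$$ c = \frac{1}{\sqrt{\varepsilon \mu}} \tag{7.42} $$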
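This speed is always lower than the speed of light in vacuum. The ratio of the speed of light in vacuum to the speed of light in the medium is called the index of refraction of the medium:
$$ n = \frac{c_{vacuum}}{c_{medium}} = \frac{1/\sqrt{\varepsilon_0 \mu_0}}{1/\sqrt{\varepsilon \mu}} = \sqrt{\frac{\varepsilon \mu}{\varepsilon_0 \mu_0}} = \sqrt{\varepsilon_r \mu_r} \tag{7.43} $$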
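As a quick numerical illustration of the relation between the material constants, the index of refraction and the speed of light in the medium, here is a minimal Python sketch. It assumes the standard definition $n = \sqrt{\varepsilon_r \mu_r}$ and a non-magnetic medium by default. The function names and the sample value $\varepsilon_r = 2.25$ are illustrative assumptions, not taken from the notes.

```python
import math

C_VACUUM = 299_792_458.0  # speed of light in vacuum in m/s (exact SI value)


def refractive_index(eps_r: float, mu_r: float = 1.0) -> float:
    """Index of refraction n = sqrt(eps_r * mu_r) of an isotropic, lossless medium."""
    if eps_r <= 0 or mu_r <= 0:
        raise ValueError("relative permittivity and permeability must be positive")
    return math.sqrt(eps_r * mu_r)


def speed_in_medium(eps_r: float, mu_r: float = 1.0) -> float:
    """Speed of light in the medium, c_vacuum / n, in m/s."""
    return C_VACUUM / refractive_index(eps_r, mu_r)


if __name__ == "__main__":
    # Hypothetical glass-like medium: eps_r = 2.25 at optical frequencies, mu_r = 1.
    n = refractive_index(2.25)
    print(f"n = {n:.3f}, speed = {speed_in_medium(2.25):.4e} m/s")
```

Note that the relative permittivity has to be the value valid at the frequency of the wave, since it generally depends on frequency.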
7.4
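The magnetic field may be determined by substituting (7.35) into Faraday's law of induction:
$$ \operatorname{rot} \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{7.44} $$
$$ \nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{7.45} $$
$$ -j\, \mathbf{k} \times \mathbf{E}_0 e^{j(\omega t - \mathbf{k}\mathbf{r})} = -\frac{\partial \mathbf{B}}{\partial t} \tag{7.46} $$
$$ \mathbf{B} = \frac{1}{\omega}\, \mathbf{k} \times \mathbf{E} \tag{7.47} $$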
Therefore in free space the magnetic field is perpendicular to both the electric field and the wavenumber vector ($\mathbf{k}$). In a similar manner it can also be shown that $\mathbf{E}$ is perpendicular to the $\mathbf{B}$ and $\mathbf{k}$ vectors. (It can be proven that this is true not only in free space, but also in any isotropic medium.) As the wave is traveling in the direction of $\mathbf{k}$, both the $\mathbf{E}$ and $\mathbf{B}$ fields are at right angles to the direction of propagation. In other words, electromagnetic waves are transverse waves.
7.5
Poynting Vector
Waves transport energy and momentum. The respective energy densities of the electric and magnetic fields are:
$$ u_E = \frac{1}{2} \varepsilon_0 E^2 \tag{7.48} $$
$$ u_B = \frac{1}{2\mu_0} B^2 \tag{7.49} $$
The amount of energy carried by the wave through a given cross-section $dA$ in time $dt$ is:
$$ U = (u_E + u_B)\, dA\, c\, dt \tag{7.50} $$
The energy transfer through a unit of area in a unit of time is:
$$ S = \frac{(u_E + u_B)\, dA\, c\, dt}{dA\, dt} \tag{7.51} $$
$$ S = \left( \frac{1}{2} \varepsilon_0 E^2 + \frac{1}{2\mu_0} B^2 \right) c \tag{7.52} $$
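According to (7.47), the magnitudes of the electric and magnetic fields are proportional to each other:
$$ |\mathbf{B}| = \frac{1}{\omega} |\mathbf{k} \times \mathbf{E}| \tag{7.53} $$
If $\mathbf{k}$ and $\mathbf{E}$ are perpendicular to each other:
$$ |\mathbf{B}| = \frac{|\mathbf{k}|\, |\mathbf{E}|}{\omega} \tag{7.54} $$
$$ |\mathbf{B}| = \frac{1}{c} |\mathbf{E}| \tag{7.55} $$
Substituting this into (7.52):
$$ S = \left( \frac{1}{2} \varepsilon_0 c\, |\mathbf{E}|\, |\mathbf{B}| + \frac{1}{2 \mu_0 c} |\mathbf{E}|\, |\mathbf{B}| \right) c \tag{7.56} $$
$$ S = \frac{|\mathbf{E}|\, |\mathbf{B}|}{2} \left( \varepsilon_0 c^2 + \frac{1}{\mu_0} \right) \tag{7.57} $$
Since $\varepsilon_0 c^2 = 1/\mu_0$, this simplifies to:
$$ S = \frac{|\mathbf{E}|\, |\mathbf{B}|}{\mu_0} \tag{7.58} $$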
We also know that the wave is traveling in the direction of the wavenumber vector, and this vector is perpendicular to both $\mathbf{E}$ and $\mathbf{B}$. Therefore energy is transported in the direction of $\mathbf{E} \times \mathbf{B}$. With this knowledge we may introduce the Poynting vector, whose magnitude gives the amount of energy transported through a unit of area in a unit of time, and which points in the direction of the energy transport:
$$ \mathbf{S} = \frac{1}{\mu_0}\, \mathbf{E} \times \mathbf{B} \tag{7.59} $$
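To make the chain from (7.52) to (7.58) concrete, the following Python sketch evaluates both expressions for a plane wave in vacuum and checks that they agree. It is a minimal illustration under stated assumptions: the field amplitude of 1000 V/m is an arbitrary sample value, the function names are hypothetical, and $\mu_0$ is approximated by $4\pi \cdot 10^{-7}$ H/m.

```python
import math

C_VACUUM = 299_792_458.0      # speed of light in vacuum, m/s
MU_0 = 4.0e-7 * math.pi       # vacuum permeability, H/m (approximate value)
EPS_0 = 1.0 / (MU_0 * C_VACUUM ** 2)  # vacuum permittivity from eps0 * mu0 * c^2 = 1


def magnetic_field_magnitude(e_field: float) -> float:
    """|B| = |E| / c for a plane wave in vacuum, as in (7.55)."""
    return e_field / C_VACUUM


def poynting_from_energy_density(e_field: float) -> float:
    """S = (u_E + u_B) * c, following (7.52)."""
    b_field = magnetic_field_magnitude(e_field)
    u_e = 0.5 * EPS_0 * e_field ** 2
    u_b = b_field ** 2 / (2.0 * MU_0)
    return (u_e + u_b) * C_VACUUM


def poynting_from_fields(e_field: float) -> float:
    """S = |E| |B| / mu0, following (7.58)."""
    return e_field * magnetic_field_magnitude(e_field) / MU_0


if __name__ == "__main__":
    e0 = 1000.0  # sample instantaneous field strength in V/m (illustrative)
    print(f"|B| = {magnetic_field_magnitude(e0):.3e} T")
    print(f"S from energy densities: {poynting_from_energy_density(e0):.1f} W/m^2")
    print(f"S from |E||B|/mu0:       {poynting_from_fields(e0):.1f} W/m^2")
```

The two results coincide, and the electric and magnetic energy densities turn out to be equal, which is the content of the step from (7.57) to (7.58).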
7.6
Light-pressure
Besides energy, a wave may also transport momentum. When it is reflected back from a surface, its momentum changes direction. As momentum should be conserved, this is only possible if there is a momentum transfer between the wave and the surface. In other words, the wave exerts a force on the surface from which it is reflected. By calculating this force we may determine the momentum carried by the wave.
So far we have considered electromagnetic waves traveling in perfect vacuum and in non-conductive media. But electric and magnetic fields can interact only with charged particles, such as electrons in a metal. It is through these charged particles that the wave can exert a force on the surface: it may influence the movement of the charged particles, then the particles may transfer the momentum received from the wave to the rest of the medium. Therefore, to calculate the force we have to describe the propagation of electromagnetic waves in conductive media. Let us consider a wave that arrives perpendicularly to a metal surface. We already know that the electric and magnetic fields of the wave are perpendicular to each other, and also to the direction of propagation. Let us choose the axes of our coordinate system so that the wave is traveling in the positive x direction, $\mathbf{E}$ is parallel to the y axis, and $\mathbf{B}$ is pointing in the direction of the z axis. In this case the electric and magnetic fields on the surface are:
$$ \mathbf{E} = (E_0 \sin(\omega t))\, \mathbf{u}_y \tag{7.60} $$
$$ \mathbf{B} = (B_0 \sin(\omega t))\, \mathbf{u}_z \tag{7.61} $$
where $\mathbf{u}_y$ and $\mathbf{u}_z$ are unit vectors pointing in the direction of the y and z axes, respectively. The force exerted on charged particles by the electric field is parallel to $\mathbf{E}$ and forces them to start oscillating in the y direction. But the amplitude of these oscillations and the velocity of the particles cannot increase to infinity due to collisions with other particles in the solid. These losses act as if the charged particles were moving in a viscous medium. The velocity increases until the drag force becomes equal in magnitude to the force exerted on the particles by the electric field.
$$ \mathbf{F}_E = (q E_0 \sin(\omega t))\, \mathbf{u}_y \tag{7.62} $$
$$ \mathbf{F}_D = -k\, \mathbf{v}_y \tag{7.63} $$
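where q is the charge of the particles, and k is a constant representing the viscosity of the medium in which the particles are moving (not to be confused with the wavenumber). Requiring that the two forces balance each other gives:
$$ \mathbf{v}_y = \left( \frac{q E_0}{k} \sin(\omega t) \right) \mathbf{u}_y \tag{7.64} $$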
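The magnetic field can also exert a force on moving charged particles:
$$ \mathbf{F}_B = q\, \mathbf{v}_y \times \mathbf{B} \tag{7.65} $$
$$ \mathbf{F}_B = q \left( \frac{q E_0}{k} \sin(\omega t) \right) \mathbf{u}_y \times (B_0 \sin(\omega t))\, \mathbf{u}_z \tag{7.66} $$
$$ \mathbf{F}_B = \left( \frac{q^2 E_0 B_0}{k} \sin^2(\omega t) \right) \mathbf{u}_x \tag{7.67} $$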
Unlike the force exerted on charged particles by the electric field, which changes sign twice every cycle and averages out over a longer period of time, this force always points toward the positive x direction. When averaged over a longer period of time (compared to the period), it results in a net force perpendicular to the surface. This is the force representing the momentum transfer between the electromagnetic wave and the surface. The energy transferred to a particle in a unit of time is:
$$ \frac{dU}{dt} = (\mathbf{F}_E + \mathbf{F}_B) \cdot \mathbf{v} \tag{7.68} $$
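As the force exerted on the particles by the magnetic field is perpendicular to their velocity, it can be ignored. Using $E_0 = c B_0$ from (7.55):
$$ \frac{dU}{dt} = (q E_0 \sin(\omega t))\, \mathbf{u}_y \cdot \left( \frac{q E_0}{k} \sin(\omega t) \right) \mathbf{u}_y \tag{7.69} $$
$$ \frac{dU}{dt} = \frac{q^2 E_0^2}{k} \sin^2(\omega t) = \frac{q^2 E_0 B_0}{k}\, c \sin^2(\omega t) \tag{7.70} $$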
Comparing equations (7.67) and (7.70) shows that the power transferred to the particle is proportional to the force exerted on it by the magnetic field:
$$ |\mathbf{F}_B| = \frac{1}{c} \frac{dU}{dt} \tag{7.71} $$
Equation (7.71) shows that the momentum transfer in a unit of time on a unit of surface area is proportional to the energy transfer in the same amount of time on the same area. We already know that the amount of energy carried by the electromagnetic wave is given by the Poynting vector. This means that the momentum of the wave is proportional to the Poynting vector:
$$ \frac{d\mathbf{p}}{dA\, dt} = \frac{\mathbf{S}}{c} \tag{7.72} $$
The light pressure is the force with which the light acts on a unit area of the surface. When light is absorbed, it transfers all of its momentum to the surface. In this case the light-pressure is:
$$ P_{abs} = \frac{dF}{dA} = \frac{dp}{dA\, dt} = \frac{|\mathbf{S}|}{c} \tag{7.73} $$
When light is reflected back from a surface, its momentum changes sign. This means that the momentum transfer between the electromagnetic wave and the surface is twice as intense as in the case of absorption:
$$ P_{refl} = \frac{2 |\mathbf{S}|}{c} \tag{7.74} $$
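The two pressure formulas (7.73) and (7.74) can be evaluated directly once the magnitude of the Poynting vector is known. The sketch below does this in Python for an intensity of about 1361 W/m², which is roughly the intensity of sunlight above the Earth's atmosphere. That sample value and the function names are assumptions made for illustration only.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s


def pressure_absorbed(intensity: float) -> float:
    """Light pressure on a perfectly absorbing surface, P = |S| / c, in Pa."""
    if intensity < 0:
        raise ValueError("intensity must be non-negative")
    return intensity / C_VACUUM


def pressure_reflected(intensity: float) -> float:
    """Light pressure on a perfectly reflecting surface, P = 2 |S| / c, in Pa."""
    return 2.0 * pressure_absorbed(intensity)


if __name__ == "__main__":
    solar_intensity = 1361.0  # W/m^2, approximate sunlight intensity (assumed value)
    print(f"absorbing surface:  {pressure_absorbed(solar_intensity):.3e} Pa")
    print(f"reflecting surface: {pressure_reflected(solar_intensity):.3e} Pa")
```

The result is of the order of a few micropascals, which illustrates why light pressure is negligible in everyday situations.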
7.7
Skin depth
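In section 7.2 we have deduced the existence of electromagnetic waves in perfect vacuum. But as we have seen in the previous section, electromagnetic waves can transfer part of their energy and momentum to charged particles. Due to these losses the intensity of the waves decreases in conductive media. To take this into account, the simplified model of section 7.2 needs to be augmented:
$$ \operatorname{rot} \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{7.75} $$
$$ \operatorname{rot}(\operatorname{rot} \mathbf{E}) = -\operatorname{rot} \frac{\partial \mathbf{B}}{\partial t} \tag{7.76} $$
$$ \operatorname{rot}(\operatorname{rot} \mathbf{E}) = -\mu \operatorname{rot} \frac{\partial \mathbf{H}}{\partial t} \tag{7.77} $$
$$ \operatorname{rot}(\operatorname{rot} \mathbf{E}) = -\mu \frac{\partial}{\partial t} \operatorname{rot} \mathbf{H} \tag{7.78} $$
According to the third Maxwell equation:
$$ \operatorname{rot} \mathbf{H} = \mathbf{j} + \frac{\partial \mathbf{D}}{\partial t} \tag{7.79} $$
In perfect vacuum, the first term on the right-hand side could be ignored. But in conductive media the electric field forces charged particles to move, therefore $\mathbf{j} = \sigma \mathbf{E}$ (where $\sigma$ is the conductivity of the medium):
$$ \operatorname{rot} \mathbf{H} = \sigma \mathbf{E} + \varepsilon \frac{\partial \mathbf{E}}{\partial t} \tag{7.80} $$
Substituting this into (7.78):
$$ \operatorname{rot}(\operatorname{rot} \mathbf{E}) = -\mu \frac{\partial}{\partial t} \left( \sigma \mathbf{E} + \varepsilon \frac{\partial \mathbf{E}}{\partial t} \right) = -\mu \sigma \frac{\partial \mathbf{E}}{\partial t} - \varepsilon \mu \frac{\partial^2 \mathbf{E}}{\partial t^2} \tag{7.81} $$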
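Using the mathematical identity $\operatorname{rot}(\operatorname{rot} \mathbf{E}) = \operatorname{grad}(\operatorname{div} \mathbf{E}) - \Delta \mathbf{E}$:
$$ \operatorname{grad}(\operatorname{div} \mathbf{E}) - \Delta \mathbf{E} = -\mu \sigma \frac{\partial \mathbf{E}}{\partial t} - \varepsilon \mu \frac{\partial^2 \mathbf{E}}{\partial t^2} \tag{7.82} $$
If there is no space-charge:
$$ \operatorname{div} \mathbf{D} = \rho = 0 \tag{7.83} $$
$$ \operatorname{div} \mathbf{E} = 0 \tag{7.84} $$
Substituting this into (7.82), the first term on the left-hand side disappears:
$$ \Delta \mathbf{E} = \varepsilon \mu \frac{\partial^2 \mathbf{E}}{\partial t^2} + \mu \sigma \frac{\partial \mathbf{E}}{\partial t} \tag{7.85} $$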
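The equation is very similar to (7.34), except for the last term, which represents losses due to the energy transferred to the conductive medium. Substituting the wave function of a plane wave into (7.85) as a trial function gives:
$$ \mathbf{E} = \mathbf{E}_0 e^{j(\omega t - \mathbf{k}\mathbf{r})} \tag{7.86} $$
$$ -\mathbf{k}\mathbf{k}\, \mathbf{E}_0 e^{j(\omega t - \mathbf{k}\mathbf{r})} = \left( -\varepsilon \mu \omega^2 + j \mu \sigma \omega \right) \mathbf{E}_0 e^{j(\omega t - \mathbf{k}\mathbf{r})} \tag{7.87} $$
$$ k^2 = \varepsilon \mu \omega^2 - j \mu \sigma \omega \tag{7.88} $$
$$ |\mathbf{k}| = \sqrt{\varepsilon \mu \omega^2 - j \mu \sigma \omega} \tag{7.89} $$
For metals, the first term on the right-hand side is negligible compared to the second term:
$$ |\mathbf{k}| = \sqrt{-j \mu \sigma \omega} \tag{7.90} $$
$$ |\mathbf{k}| = \sqrt{\mu \sigma \omega}\, \sqrt{-j} \tag{7.91} $$
$$ \sqrt{-j} = \frac{1 - j}{\sqrt{2}} \tag{7.92} $$
$$ |\mathbf{k}| = \sqrt{\frac{\mu \sigma \omega}{2}}\, (1 - j) \tag{7.93} $$
Therefore k has an imaginary component. (The situation is similar to the underdamped case of the damped oscillator, where the imaginary component of the angular frequency caused an exponential decay of the amplitude.) Writing $k = k_r + j k_{im}$ and substituting (7.93) back into (7.35) gives:
$$ \mathbf{E} = \mathbf{E}_0 e^{j(\omega t - k r)} \tag{7.94} $$
$$ \mathbf{E} = \mathbf{E}_0 e^{j(\omega t - (k_r + j k_{im}) r)} \tag{7.95} $$
$$ \mathbf{E} = \mathbf{E}_0 e^{k_{im} r} e^{j(\omega t - k_r r)} \tag{7.96} $$
According to (7.93), $k_{im} = -\sqrt{\mu \sigma \omega / 2}$ is negative. Therefore the amplitude of electromagnetic waves decays exponentially in conductive media. Due to the $e^{k_{im} r}$ term, the rate of this decay depends on $|k_{im}| = \sqrt{\mu \sigma \omega / 2}$. The distance over which the amplitude of the wave decreases by a factor of e is called skin depth:
$$ \delta = \frac{1}{|k_{im}|} = \sqrt{\frac{2}{\mu \sigma \omega}} \tag{7.97} $$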
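The skin depth formula $\delta = \sqrt{2/(\mu \sigma \omega)}$ is easy to evaluate numerically. The following Python sketch computes it for copper at a few frequencies. It is a minimal illustration under stated assumptions: the conductivity of copper is taken as roughly $5.96 \cdot 10^7$ S/m with $\mu_r \approx 1$, $\mu_0$ is approximated by $4\pi \cdot 10^{-7}$ H/m, and the good-conductor approximation used in the derivation is assumed to hold. The function name is hypothetical.

```python
import math

MU_0 = 4.0e-7 * math.pi  # vacuum permeability, H/m (approximate value)


def skin_depth(conductivity: float, frequency_hz: float, mu_r: float = 1.0) -> float:
    """Skin depth in metres: delta = sqrt(2 / (mu * sigma * omega)).

    Valid only in the good-conductor limit (sigma much larger than eps * omega).
    """
    if conductivity <= 0 or frequency_hz <= 0 or mu_r <= 0:
        raise ValueError("conductivity, frequency and mu_r must be positive")
    omega = 2.0 * math.pi * frequency_hz
    return math.sqrt(2.0 / (MU_0 * mu_r * conductivity * omega))


if __name__ == "__main__":
    sigma_copper = 5.96e7  # S/m, approximate room-temperature value (assumed)
    for freq in (50.0, 1.0e3, 1.0e6, 1.0e9):
        delta = skin_depth(sigma_copper, freq)
        print(f"f = {freq:10.3e} Hz  ->  skin depth = {delta * 1e3:.4f} mm")
```

Because the skin depth scales with $1/\sqrt{\omega}$ and with $1/\sqrt{\mu}$, increasing either the frequency or the permeability by a factor of 100 reduces it by a factor of 10.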
This result helps us to choose suitable materials for electromagnetic shielding. Skin depth depends not only on the conductivity of the medium, but also on its magnetic permeability. Although copper and aluminium are good conductors, their permeabilities are low, therefore (contrary to common belief) they are not ideal for electromagnetic shielding. Several different alloys (such as permalloy and mu-metal) have been developed for this purpose. Although they are not as good conductors as copper, their permeability is several orders of magnitude higher, which makes them much more suitable for shielding. Another interesting property of skin depth is that it depends not only on the properties of the medium, but also on the frequency of the electromagnetic wave. High frequency waves transfer their energy to the medium very quickly, and they have a very small skin depth. As the frequency of visible light is very high, its skin depth is very small. This is the reason why metals are not transparent. At lower frequencies electromagnetic waves may penetrate deep into the medium. This also means that shielding against low frequency electromagnetic interference is considerably harder. This is also the reason why nuclear submarines use very low frequency radio waves to communicate with their command centre: seawater is a conductive medium¹, and it absorbs radio waves. By using lower frequencies the skin depth may be increased, and the submarine may stay in contact with its command centre even while it is submerged. (It must be noted, however, that low frequency communication has a very limited bandwidth, therefore such communication channels are used only to transfer the most crucial commands.)
¹ It must be noted that although water is a conductive medium, it is transparent for visible light. The reason for this is that in water it is not electrons that conduct electric currents but the ions of different kinds of salts. These ions are considerably less mobile than the electrons in metals. Therefore they are unable to follow the quick changes of high frequency electromagnetic waves. Although water is perfectly capable of absorbing radio waves, the mechanism described above does not work at higher frequencies. (It must be noted, however, that other regimes of the electromagnetic spectrum may be absorbed by other mechanisms.)
7.8
In the previous section we have concluded that visible light cannot penetrate deep into conductive media. But we know from practical experience that there are transparent materials such as glass and different types of polymers. (A common characteristic of these materials is that they are non-conductive.) In this section we shall discuss the behaviour of electromagnetic radiation at the interface between two such non-conductive materials. Imagine an electromagnetic wave arriving at the boundary. Part of it may be reflected back from the interface, the rest will be transmitted. Therefore we shall consider three electromagnetic waves: the incident wave (whose electric field is $\mathbf{E}_i$), the reflected wave (whose electric field is $\mathbf{E}_r$), and the transmitted wave (whose electric field is $\mathbf{E}_t$):
$$ \mathbf{E}_i = \mathbf{E}_{i0} e^{j(\mathbf{k}_i \mathbf{r} - \omega_i t)} \tag{7.98} $$
$$ \mathbf{E}_r = \mathbf{E}_{r0} e^{j(\mathbf{k}_r \mathbf{r} - \omega_r t)} \tag{7.99} $$
$$ \mathbf{E}_t = \mathbf{E}_{t0} e^{j(\mathbf{k}_t \mathbf{r} - \omega_t t)} \tag{7.100} $$
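At the boundary between the two media all three waves must exist simultaneously, and the tangential components of the electric field shall be equal on both sides. In mathematical terms:
$$ \mathbf{n} \times \mathbf{E}_i + \mathbf{n} \times \mathbf{E}_r = \mathbf{n} \times \mathbf{E}_t \tag{7.101} $$
where $\mathbf{n}$ is the normal vector of the interface.
$$ \mathbf{n} \times \mathbf{E}_{i0} e^{j(\mathbf{k}_i \mathbf{r} - \omega_i t)} + \mathbf{n} \times \mathbf{E}_{r0} e^{j(\mathbf{k}_r \mathbf{r} - \omega_r t)} = \mathbf{n} \times \mathbf{E}_{t0} e^{j(\mathbf{k}_t \mathbf{r} - \omega_t t)} \tag{7.102} $$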
The equation above holds for all moments of time at all points of the interface only if:
$$ \mathbf{k}_i \mathbf{r} - \omega_i t = \mathbf{k}_r \mathbf{r} - \omega_r t = \mathbf{k}_t \mathbf{r} - \omega_t t \tag{7.103} $$
This means that the phases of all three waves should be the same at the interface at all moments of time. This also means that it should hold for $\mathbf{r} = 0$:
$$ -\omega_i t = -\omega_r t = -\omega_t t \tag{7.104} $$
$$ \omega_i = \omega_r = \omega_t \tag{7.105} $$
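Therefore the frequencies of all three waves are the same. Reflection and refraction may change the direction of propagation (and in the case of refraction even the wavelength may change), but the frequency always remains the same. We may also use (7.103) to determine the direction into which the wave is reflected or refracted. As the equation is true for all moments of time, it should also hold for t = 0:
$$ \mathbf{k}_i \mathbf{r} = \mathbf{k}_r \mathbf{r} = \mathbf{k}_t \mathbf{r} \tag{7.106} $$
Here $\mathbf{r}$ lies in the interface, so each scalar product can be expressed with the angle $\theta$ between the respective wavenumber vector and the normal of the interface. From $\mathbf{k}_i \mathbf{r} = \mathbf{k}_r \mathbf{r}$:
$$ k_i r \sin\theta_i = k_r r \sin\theta_r \tag{7.107} $$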
Both the incident and the reflected wave travel in the same medium, therefore their velocities are identical. Since their frequency is also the same, $k_i = k_r$. Therefore:
$$ \sin\theta_i = \sin\theta_r \tag{7.108} $$
$$ \theta_i = \theta_r \tag{7.109} $$
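This means that reflection is symmetrical: the angle between the incident wave and the normal of the interface is the same as the angle between the reflected wave and the normal. This is called the law of reflection. The principle is utilised in a wide range of optical systems from telescopes to the headlights of cars. From $\mathbf{k}_i \mathbf{r} = \mathbf{k}_t \mathbf{r}$, using $k = \omega / v$ and $v = c / n$:
$$ k_i r \sin\theta_i = k_t r \sin\theta_t \tag{7.110} $$
$$ \frac{\sin\theta_t}{\sin\theta_i} = \frac{k_i}{k_t} = \frac{v_t}{v_i} = \frac{c / n_2}{c / n_1} \tag{7.111} $$
$$ \frac{\sin\theta_t}{\sin\theta_i} = \frac{n_1}{n_2} \tag{7.112} $$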
where $n_1$ is the index of refraction of the first medium (from which the wave arrives at the interface) and $n_2$ is the index of refraction of the second medium (in which the transmitted wave travels). Equation (7.112) is called the law of refraction or Snell's law. It shows that, except at perpendicular incidence, electromagnetic waves cannot travel through the interface between two media of different refractive indices without changing their direction of propagation. This phenomenon has widespread applications in optics. Lenses, optical fibres, and prisms are all based on this phenomenon. Even rainbows appear due to the fact that the index of refraction depends on the wavelength of light, therefore different colours are refracted into different directions by the tiny drops of water floating in the air after a rain.
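Snell's law can be turned into a small calculation of the refraction angle. The Python sketch below is a minimal illustration: the function name and the sample refractive indices (1.0 for air, 1.5 for a typical glass) are assumptions, and the function returns `None` when the sine of the refraction angle would exceed 1, in which case no transmitted wave exists.

```python
import math
from typing import Optional


def refraction_angle_deg(theta_i_deg: float, n1: float, n2: float) -> Optional[float]:
    """Refraction angle in degrees from sin(theta_t) / sin(theta_i) = n1 / n2.

    Returns None if there is no transmitted wave (sin(theta_t) would exceed 1).
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("refractive indices must be positive")
    sin_t = (n1 / n2) * math.sin(math.radians(theta_i_deg))
    if abs(sin_t) > 1.0:
        return None
    return math.degrees(math.asin(sin_t))


if __name__ == "__main__":
    # Illustrative values: light passing from air (n = 1.0) into glass (n = 1.5).
    for theta_i in (0.0, 15.0, 30.0, 45.0, 60.0):
        theta_t = refraction_angle_deg(theta_i, 1.0, 1.5)
        print(f"incident {theta_i:5.1f} deg -> refracted {theta_t:6.2f} deg")
```

When the wave enters the medium with the higher refractive index, the refracted ray is bent toward the normal, and at perpendicular incidence the direction is unchanged.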
8.1
In the previous chapter we have deduced the law of refraction, also known as Snell's law:
$$ \frac{\sin\theta_r}{\sin\theta_i} = \frac{n_1}{n_2} \tag{8.1} $$
where the so-called incident angle ($\theta_i$) is the angle between the incident ray of light and the normal of the surface, and the refraction angle ($\theta_r$) is the angle between the refracted ray of light and the normal. The refractive index of the medium from which the incident ray of light arrives at the interface is $n_1$, and $n_2$ is the refractive index of the other medium. The equation shows that when light is traveling from an optically denser material to an optically rarer material ($n_1 > n_2$), the refraction angle is larger than the incident angle. This means that there is a critical incident angle (which is smaller than 90 degrees), at which the refraction angle reaches 90 degrees. According to Snell's law this critical angle satisfies $\sin\theta_c = n_2 / n_1$. Since at larger incident angles light cannot pass through the interface, it is completely reflected back to the optically denser medium following the law of reflection. This is called total internal reflection. The phenomenon has several practical applications. Many optical systems apply prisms instead of mirrors. A prism is basically a piece of glass with flat polished surfaces. Light may freely enter the prism through a side that is arranged to be perpendicular to its trajectory. The prism is cut in such a way that the ray of light reaches the second side of the prism at an angle that is larger than the aforementioned critical angle, thus it undergoes total internal reflection, and continues its trajectory as if it was reflected by a mirror. A third side of the prism is cut at an angle that is perpendicular to the new direction of the ray, so that it can leave the prism. A similar principle can be used to guide light over large distances. For example, a simple optical fibre consists of a transparent core, surrounded by a cladding material with a lower index of refraction. The end of the fibre is usually perpendicular to its axis, so light can enter the core easily from the axial direction. These rays of light will reach the interface between the core and the cladding at a flat angle.
Due to total internal reflection they bounce back from the interface and continue their trajectory inside the core. Since reflection is symmetrical, the new direction is also close to the axial direction, and the ray reaches the interface at flat angles over and over again and bounces back into the core, without ever leaving it. This way light can be guided over large distances inside the core of the fibre.
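The critical angle of total internal reflection follows from Snell's law by setting the refraction angle to 90 degrees, which gives $\sin\theta_c = n_2 / n_1$. The following Python sketch evaluates it. The function name and the sample refractive indices (glass to air, and a hypothetical fibre core and cladding pair) are illustrative assumptions.

```python
import math


def critical_angle_deg(n1: float, n2: float) -> float:
    """Critical incident angle in degrees for light going from medium 1 to medium 2.

    Total internal reflection is only possible when n1 > n2.
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("refractive indices must be positive")
    if n1 <= n2:
        raise ValueError("total internal reflection requires n1 > n2")
    return math.degrees(math.asin(n2 / n1))


if __name__ == "__main__":
    # Glass prism in air (illustrative indices 1.5 and 1.0).
    print(f"glass -> air: {critical_angle_deg(1.5, 1.0):.2f} deg")
    # Hypothetical optical fibre: core index 1.48, cladding index 1.46.
    print(f"core -> cladding: {critical_angle_deg(1.48, 1.46):.2f} deg")
```

For the glass prism the critical angle is below 45 degrees, which is why a right-angle prism can act as a mirror. For the fibre the critical angle is close to 90 degrees, so only rays travelling nearly parallel to the axis are guided.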
8.2
Spherical Mirror
From the headlights of cars to astronomical telescopes, many different optical systems contain curved mirrors. Their exact shapes may vary depending on the application, but the simplest type is the so-called spherical mirror. It is relatively easy to describe its behaviour in mathematical terms, and the principles we may deduce for it may be applicable to other types of curved mirrors, too. To describe spherical mirrors, we have to consider three distinct cases. The first one is depicted in figure 8.1. Point C marks the centre of the sphere, while M is the centre of the mirror. The line drawn through these points is called the optical axis. Imagine that an object on the optical axis (in point O) emits a ray of light, which is reflected back from the mirror in point P. Let us calculate the position where this reflected ray will cross the optical axis (point I)!
Figure 8.1: Reflection on a concave mirror if the light source is farther away from the mirror than its radius

The CP segment is the radius of the sphere, thus it is perpendicular to the surface of the mirror. Therefore the incident and reflected rays should be symmetrical to this line. In other words, the angle between the incident ray of light and the dashed line (the incident angle) is the same as the angle between the dashed line and the reflected ray of light (reflection angle):
$$ \theta_i = \theta_r = \theta \tag{8.2} $$
Let $\alpha$, $\beta$ and $\gamma$ denote the angles that the OP, CP and IP lines form with the optical axis (that is, the angles at points O, C and I, respectively). The angle $\gamma$ is the exterior angle of both the OIP and CIP triangles. Since the exterior angle of a triangle is the sum of the two remote interior angles, we may calculate $\gamma$ from the angles of both triangles:
$$ \gamma = \alpha + 2\theta \tag{8.3} $$
$$ \gamma = \beta + \theta \tag{8.4} $$
We may eliminate $\theta$ by combining these two equations:
$$ \alpha + \gamma = 2\beta \tag{8.5} $$
The angle $\alpha$ may be determined from the MOP triangle. If it is small enough, point P is so close to point M that the curvature of the mirror becomes negligible and the OMP angle is close to 90 degrees. In this case $\sin\alpha \approx \frac{h}{OM}$, where h is the distance of point P from the optical axis. But for small angles $\sin\alpha$ may be approximated by $\alpha$, therefore $\alpha \approx \frac{h}{OM}$. Since this approximation is applicable only when the rays of light are almost parallel to the optical axis, it is usually referred to as the paraxial approximation. The $\beta$ and $\gamma$ angles may be estimated in a similar manner. From the PCM triangle $\beta \approx \frac{h}{CM}$, and from the PIM triangle $\gamma \approx \frac{h}{IM}$. Substituting these into (8.5) gives:
$$ \frac{h}{OM} + \frac{h}{IM} = \frac{2h}{CM} \tag{8.6} $$
$$ \frac{1}{OM} + \frac{1}{IM} = \frac{2}{CM} \tag{8.7} $$
An interesting feature of this equation is that it is independent of the angles of the rays (as long as they are suitably small). All rays of light emitted by the object (at a suitably small angle) will be focused to the same point: all reflected rays will cross each other in point I, and continue their trajectories as if they had originated from that point. When we look at objects in the physical world, our eyes detect the light scattered on their surface. Basically all points of the object act like a light source. Therefore if we place an object in point O, the mirror will project the rays of light scattered on its surface to point I, and they will continue their trajectory as if they had originated from that point. An observer looking into the optical system will perceive these reflected rays as if they had been scattered on the surface of an object in point I. In other words, the concave spherical mirror projects an image of the object to point I.
Figure 8.2: Reflection on a concave mirror if the light source is closer to the mirror than its radius
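It must be noted that the situation is slightly different if the object is closer to the mirror than its radius. This second case is depicted in figure 8.2. Again, the CP segment is the radius of the sphere, thus
$$ \theta_i = \theta_r = \theta \tag{8.8} $$
Again, the angle $\gamma$ (the angle at point I, with $\alpha$ and $\beta$ denoting the angles at points O and C) may be determined from both the OIP and CIP triangles. (But in this case $\gamma$ is one of the internal angles of both triangles. The external angles are $\theta_i + \theta_r = 2\theta$ and $\theta_r = \theta$, respectively.)
$$ 2\theta = \alpha + \gamma \tag{8.9} $$
$$ \theta = \beta + \gamma \tag{8.10} $$
$\theta$ may be eliminated by combining the two equations:
$$ \alpha - \gamma = 2\beta \tag{8.11} $$
Using the paraxial approximation:
$$ \frac{1}{OM} - \frac{1}{IM} = \frac{2}{CM} \tag{8.12} $$

Figure 8.3: Reflection on a convex mirror

The third case to consider is when the object is placed in front of a convex mirror (figure 8.3). From the OPI and OPC triangles:
$$ 2\theta = \alpha + \gamma \tag{8.13} $$
$$ \theta = \alpha + \beta \tag{8.14} $$
Combining the two equations to eliminate $\theta$ gives:
$$ \alpha - \gamma = -2\beta \tag{8.15} $$
Using the paraxial approximation:
$$ \frac{1}{OM} - \frac{1}{IM} = -\frac{2}{CM} \tag{8.16} $$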
Note that in the first case the image appears in front of the mirror, while in the second and third case it is behind it. Consequently in equation (8.7) the second term $\frac{1}{IM}$ has a positive sign, while in equations (8.12) and (8.16) it has a negative sign. Also, in the first two cases (where we have discussed concave mirrors) the right-hand side of the equation had a positive sign, while in the third case (where we have discussed the behaviour of a convex mirror) the right-hand side had a negative sign. Apart from these, the three equations are identical. Therefore we may express all three cases by a single equation:
$$ \frac{1}{o} + \frac{1}{i} = \frac{1}{f} \tag{8.17} $$
The parameter o is called object distance. It is the distance of the object (point O) from the centre of the mirror (point M). It has a positive sign if the object is in front of the mirror, and a negative sign if it is behind it. (The latter may occur when the mirror is part of a complex optical system, and the previous stage of the system projects the image of an object behind the mirror. In this case this image, which appears behind the mirror, serves as the object of the next projection.) The parameter i is called image distance. It is the distance of the image (point I) from the centre of the mirror (point M). It has a positive sign if the image is in front of the mirror, and a negative sign if it is behind it. The parameter f is called the focal distance of the mirror, because rays of light arriving at the mirror parallel to the optical axis will be focused to a point on the optical axis (the so-called focal point) which is precisely f distance away from the centre of the mirror (point M). (This can be imagined as if light is coming from an object infinitely far away, so that the divergence of the rays is negligible. In this case the object distance is infinite, and its inverse is zero. Therefore the image distance becomes equal to the focal distance.) For spherical mirrors f is half of the radius of curvature. The sign of the focal distance is positive for concave mirrors, and negative for convex mirrors.
Equation (8.17) is called the projection law, and it is applicable not only to spherical mirrors, but to other types of mirrors, too. Although their shapes are different, in the paraxial approximation parabolic and hyperbolic mirrors behave in a similar manner.
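The three mirror cases discussed above can be reproduced numerically from the projection law together with the sign convention for o, i and f. The Python sketch below is a minimal illustration. The function names and the sample radius of 0.4 m are assumptions made for the example.

```python
import math


def mirror_focal_distance(radius: float, concave: bool = True) -> float:
    """Focal distance of a spherical mirror: half the radius, negative if convex."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    return radius / 2.0 if concave else -radius / 2.0


def image_distance(o: float, f: float) -> float:
    """Solve 1/o + 1/i = 1/f for i. Returns math.inf if the object is at the focal point."""
    inverse_i = 1.0 / f - 1.0 / o
    if inverse_i == 0.0:
        return math.inf
    return 1.0 / inverse_i


if __name__ == "__main__":
    f_concave = mirror_focal_distance(0.4, concave=True)   # +0.2 m
    f_convex = mirror_focal_distance(0.4, concave=False)   # -0.2 m
    cases = [
        ("concave, object beyond the centre", 0.6, f_concave),
        ("concave, object inside the focal distance", 0.1, f_concave),
        ("convex", 0.6, f_convex),
    ]
    for label, o, f in cases:
        i = image_distance(o, f)
        side = "in front of" if i > 0 else "behind"
        print(f"{label}: i = {i:+.3f} m (image {side} the mirror)")
```

A positive image distance corresponds to an image in front of the mirror, while a negative one corresponds to an image behind it, in line with the sign convention of (8.17).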
8.3
Optical systems may contain lenses instead of curved mirrors. These are usually made of glass, and just like spherical or parabolic mirrors, they may be used to focus light and project images of objects. Whereas in the case of spherical mirrors light was focused by reflection on a curved surface, in the case of lenses light is refracted on the curved boundaries between air and glass. To describe the behaviour of lenses, let us calculate how the direction of a ray of light changes when it enters the lens, and when it leaves it. For the sake of simplicity let us consider spherical surfaces yet again:
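Figure 8.4: Refraction of light on the input surface of a spherical lens

In figure 8.4 a ray of light originating from point O enters the lens in point $P_1$. Point $C_1$ marks the centre of the spherical surface, thus the $P_1 C_1$ section is a radius of the sphere and it is perpendicular to the surface. The incident angle ($\theta_{i1}$) is the angle between the $OP_1$ and $P_1 C_1$ lines, whereas the refraction angle ($\theta_{r1}$) is the angle between the perpendicular ($P_1 C_1$ line) and the refracted ray of light ($P_1 I_1$ line). The refractive indices of the outside medium and the glass are marked by $n_1$ and $n_2$, respectively. Let $\alpha_1$, $\beta_1$ and $\gamma_1$ denote the angles that the $OP_1$, $C_1 P_1$ and $I_1 P_1$ lines form with the optical axis (the angles at points O, $C_1$ and $I_1$, respectively). The $\theta_{i1}$ angle is the external angle of the $OP_1 C_1$ triangle, therefore:
$$ \theta_{i1} = \alpha_1 + \beta_1 \tag{8.18} $$
The $\beta_1$ angle is the external angle of the $P_1 C_1 I_1$ triangle, thus:
$$ \beta_1 = \theta_{r1} + \gamma_1 \tag{8.19} $$
According to the law of refraction:
$$ n_1 \sin(\theta_{i1}) = n_2 \sin(\theta_{r1}) \tag{8.20} $$
In the paraxial approximation:
$$ n_1 \theta_{i1} = n_2 \theta_{r1} \tag{8.21} $$
$\theta_{i1}$ and $\theta_{r1}$ may be expressed from equations (8.18) and (8.19). Substituting these into equation (8.21) gives:
$$ n_1 \alpha_1 + n_1 \beta_1 = n_2 \beta_1 - n_2 \gamma_1 \tag{8.22} $$
$$ n_1 \alpha_1 + n_2 \gamma_1 = (n_2 - n_1) \beta_1 \tag{8.23} $$
The angles $\alpha_1$, $\gamma_1$ and $\beta_1$ may be estimated in the paraxial approximation in a similar fashion as in the previous section (h is the distance of point $P_1$ from the optical axis):
$$ \alpha_1 \approx \frac{h}{OM} \tag{8.24} $$
$$ \gamma_1 \approx \frac{h}{I_1 M} \tag{8.25} $$
$$ \beta_1 \approx \frac{h}{C_1 M} \tag{8.26} $$
Substituting these into equation (8.23) gives:
$$ \frac{n_1}{OM} + \frac{n_2}{I_1 M} = \frac{n_2 - n_1}{C_1 M} \tag{8.27} $$
Or:
$$ \frac{n_1}{o} + \frac{n_2}{i_1} = \frac{n_2 - n_1}{R_1} \tag{8.28} $$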
where o is the object distance, $R_1$ is the radius of the input surface of the lens, and $i_1$ is the distance of point $I_1$ (where the ray crosses the optical axis) from the centre of the lens (point M). The only problem is that actual lenses are usually thin, and therefore the ray leaves the lens before it could reach point $I_1$. During this, it undergoes a second refraction, which can be described in a similar fashion as the first. In figure 8.5 the ray of light arrives at point $P_2$ where it undergoes its second refraction. Whereas it would have crossed the optical axis in point $I_1$, after the second refraction it will head towards point I. Point $C_2$ marks the centre of the spherical exit surface whose radius is $R_2$. As the $C_2 P_2$ segment is the radius of the sphere, it is perpendicular to the surface. The incident angle (the angle between the incoming ray of light and the perpendicular) is marked by $\theta_{i2}$, while the refraction angle is marked by $\theta_{r2}$. Let $\alpha_2$, $\gamma_2$ and $\beta_2$ denote the angles at points $I_1$, I and $C_2$, respectively. The $\theta_{i2}$ and $\theta_{r2}$ angles are the external angles of the $C_2 P_2 I_1$ and $C_2 P_2 I$ triangles, therefore:
$$ \theta_{i2} = \beta_2 + \alpha_2 \tag{8.29} $$
$$ \theta_{r2} = \beta_2 + \gamma_2 \tag{8.30} $$
According to the law of refraction:
$$ n_2 \sin(\theta_{i2}) = n_1 \sin(\theta_{r2}) \tag{8.31} $$

Figure 8.5: Refraction of light on the exit surface of a spherical lens

In the paraxial approximation:
$$ n_2 \theta_{i2} = n_1 \theta_{r2} \tag{8.32} $$
Substituting $\theta_{i2}$ and $\theta_{r2}$ from (8.29) and (8.30) into this equation gives:
$$ n_2 \beta_2 + n_2 \alpha_2 = n_1 \beta_2 + n_1 \gamma_2 \tag{8.33} $$
$$ n_1 \gamma_2 - n_2 \alpha_2 = (n_2 - n_1) \beta_2 \tag{8.34} $$
The $\alpha_2$, $\gamma_2$ and $\beta_2$ angles may be estimated from the $P_2 I_1 M$, $P_2 I M$ and $P_2 C_2 M$ triangles, respectively:
$$ \frac{n_1}{IM} - \frac{n_2}{I_1 M} = \frac{n_2 - n_1}{C_2 M} \tag{8.35} $$
Or:
$$ \frac{n_1}{i} - \frac{n_2}{i_1} = \frac{n_2 - n_1}{R_2} \tag{8.36} $$
By adding equations (8.28) and (8.36), the $i_1$ distance may be eliminated:
$$ \frac{n_1}{o} + \frac{n_1}{i} = (n_2 - n_1) \left( \frac{1}{R_1} + \frac{1}{R_2} \right) \tag{8.37} $$
$$ \frac{1}{o} + \frac{1}{i} = (n - 1) \left( \frac{1}{R_1} + \frac{1}{R_2} \right) \tag{8.38} $$
where $n = \frac{n_2}{n_1}$. Comparing this equation to the projection law for spherical mirrors, it is obvious that the two are very similar to each other, and the focal distance of the lens is determined by the following formula:
$$ \frac{1}{f} = (n - 1) \left( \frac{1}{R_1} + \frac{1}{R_2} \right) \tag{8.39} $$
This equation is commonly referred to as the lensmaker's equation. (Using this formula a lensmaker may estimate the necessary curvatures to form a lens of a given focal length.) The signs of the parameters in the equation depend on whether the surfaces are convex or concave. In this form of the equation both $R_1$ and $R_2$ are positive for convex surfaces and negative for concave surfaces. (The deduction for concave surfaces would be very similar to the one presented above, only the signs of certain terms would be different.) It must be noted that we have made several approximations during the above deduction, and simplified the problem considerably. This means that the equation in this form is valid only for thin lenses (where the curvature radii are considerably larger than the thickness of the lens). For thick lenses the formula must be augmented by an additional term:
$$ \frac{1}{f} = (n - 1) \left( \frac{1}{R_1} + \frac{1}{R_2} - \frac{(n - 1) d}{n R_1 R_2} \right) \tag{8.40} $$
where d is the thickness of the lens.
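The lensmaker's equation and its thick-lens extension can be compared numerically. The Python sketch below is a minimal illustration that uses the sign convention of the text (both radii positive for convex surfaces). The function names and the sample lens (n = 1.5, both radii 0.1 m, thickness 1 cm) are assumptions chosen for the example.

```python
def thin_lens_focal_distance(n: float, r1: float, r2: float) -> float:
    """Focal distance from 1/f = (n - 1) * (1/R1 + 1/R2).

    Sign convention as in the text: R1 and R2 are positive for convex surfaces.
    """
    power = (n - 1.0) * (1.0 / r1 + 1.0 / r2)
    if power == 0.0:
        raise ValueError("the lens has no focusing power")
    return 1.0 / power


def thick_lens_focal_distance(n: float, r1: float, r2: float, d: float) -> float:
    """Focal distance from 1/f = (n - 1) * (1/R1 + 1/R2 - (n - 1) d / (n R1 R2))."""
    power = (n - 1.0) * (1.0 / r1 + 1.0 / r2 - (n - 1.0) * d / (n * r1 * r2))
    if power == 0.0:
        raise ValueError("the lens has no focusing power")
    return 1.0 / power


if __name__ == "__main__":
    # Illustrative symmetric biconvex lens: n = 1.5, R1 = R2 = 0.1 m.
    n, r1, r2 = 1.5, 0.1, 0.1
    print(f"thin lens:            f = {thin_lens_focal_distance(n, r1, r2):.4f} m")
    print(f"thick lens, d = 1 cm: f = {thick_lens_focal_distance(n, r1, r2, 0.01):.4f} m")
```

For a lens whose thickness is small compared to its curvature radii the correction is minor, which is the justification for using the thin-lens form in the rest of the chapter.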
8.4
As we have seen in the previous sections, the behaviour of spherical lenses and mirrors is very similar. Both can be described by the projection law:
$$ \frac{1}{o} + \frac{1}{i} = \frac{1}{f} \tag{8.41} $$
In general we may say that:
• The object distance (o) is the distance of the object from the centre of the lens or mirror, and it has a positive sign if the rays of light heading towards the lens or mirror are divergent, and a negative sign if they are convergent.
• The image distance (i) is the distance of the image from the centre of the lens or mirror, and it has a positive sign if the rays leaving the lens or mirror are convergent, and a negative sign if they are divergent.
• The focal distance (f) depends on the curvature radii. For spherical mirrors it is half of the radius of curvature, with a positive sign for concave mirrors, and a negative sign for convex mirrors. In the case of lenses, the focal distance may be determined by the lensmaker's equation. The inverse of the focal distance (with the focal distance measured in metres) is the optical power, whose unit is the dioptre. It is commonly used by opticians to define the optical power of prescription glasses.
Using the projection law we may calculate where the image of an object will appear. Alternatively we may use simple geometrical principles to follow certain characteristic rays of light originating from the object, and check where they cross each other to construct the image. These characteristic rays are:
• A ray that arrives parallel to the optical axis will be reflected back from the mirror or refracted by the lens so that it goes through the focal point.
• A ray arriving through the focal point will be reflected back from the mirror or refracted by the lens parallel to the optical axis.
• A ray arriving at the centre of the mirror will be reflected back symmetrically to the optical axis. In the case of a lens its direction will not be altered.
In the following examples we will construct the images formed by lenses, but the projections of mirrors may be constructed in an equally simple manner.
Figure 8.6: Construction of the image formed by a converging lens if the object is farther away from the lens than its focal distance
Figure 8.6 demonstrates the case where an object is placed in front of a convex lens, and the object distance is larger than the focal distance. The refracted rays of light are convergent: they cross each other on the other side of the lens. If we were to place a screen at this position, a clear image of the object would appear on it. Therefore this kind of image is known as a real image. It must also be noted that the AMB and CMD triangles are similar triangles: the lengths of their corresponding sides are proportional to each other. Based on this we may determine the magnification of the image from the ratio of the image distance and the object distance:
$$ M = \frac{AB}{DC} = -\frac{AB}{CD} = -\frac{AM}{CM} = -\frac{i}{o} \tag{8.42} $$
where o is the object distance, and i is the image distance. The negative sign signifies that the image stands upside down.
If the object is placed precisely into the focal point of a convex lens, all rays of light originating from it will be refracted parallel to the optical axis, and no image will be formed. In figure 8.7 the object is placed closer to a convex lens than its focal point. In this case the rays originating from the object are divergent even after they have been refracted by the lens. This means they will never cross each other, and no real image will be formed. (There is no position in the optical system where a clear image would appear on a screen.)
It must be noted however that if we extend the refracted rays backwards, these extensions (marked by dotted lines on the figure) will cross each other behind the object. In other words, for an observer looking into the optical system from the right it will seem as if the refracted rays originate from that point: the observer will see a so-called virtual image behind the lens. The magnification can again be calculated by equation (8.42). Since in this case the image distance is negative, the two negative signs cancel each other out, and the magnification becomes positive. (The image stands upright.) As in this case the image always appears behind the object, the image distance is always larger in magnitude than the object distance, and the image is always magnified.
If we place an object in front of a concave lens (also known as a diverging lens or negative lens), the refracted rays will be even more divergent than the original rays. Therefore the image is virtual. A diverging lens can never form a real image, regardless of the position of the object. As the image distance is always negative for a virtual image, the magnification is always a positive number, therefore the image stands upright.
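The three cases discussed above can be checked numerically. The following is a minimal sketch, assuming the thin-lens equation $1/o + 1/i = 1/f$ (not restated in this section) together with equation (8.42), and the sign convention that $f < 0$ for a diverging lens and $i < 0$ for a virtual image. The function name and the numerical values are illustrative only.

```python
def thin_lens_image(o, f):
    """Return (image distance i, magnification M) for a thin lens.

    Assumes 1/o + 1/i = 1/f and M = -i/o (equation 8.42).
    f > 0: converging lens, f < 0: diverging lens.
    """
    if o == f:
        # Object in the focal plane: the refracted rays are parallel, no image.
        raise ValueError("object in the focal plane: no image is formed")
    i = o * f / (o - f)
    return i, -i / o


# Illustrative distances in centimetres (not taken from the text)
cases = [
    ("converging lens, o > f", 30.0, 10.0),
    ("converging lens, o < f", 5.0, 10.0),
    ("diverging lens", 30.0, -10.0),
]
for label, o, f in cases:
    i, M = thin_lens_image(o, f)
    kind = "real" if i > 0 else "virtual"
    orientation = "upright" if M > 0 else "inverted"
    print(f"{label}: i = {i:.2f} cm, M = {M:.2f} ({kind}, {orientation})")
```

With these values the first case gives a real, inverted image, while the other two give virtual, upright images, in line with the constructions of figures 8.6 and 8.7.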
Figure 8.7: Construction of the image formed by a converging lens if the object is closer than its focal distance
8.5 Aberrations
In the derivations of the previous sections we have made several approximations. This means that although the resulting formulas are close to being true in the paraxial approximation, they are never exact. Also, the shapes of mirrors and lenses can never be completely precise: there are always minor deviations from the ideal due to the inaccuracies of the manufacturing process. (The parameters of optical components may change even after manufacturing, due to mechanical stress or temperature differences.) The materials used to build optical systems may scatter light or behave differently at different wavelengths. All these deviations from the ideal behaviour are referred to as aberrations.
Chromatic aberration
Although during the discussion of lenses we have assumed that their index of refraction is a constant, in practice this is not true. The index of refraction of practical materials depends on the wavelength of light. Since the focal length changes with the index of refraction, lenses have slightly different focal distances for different wavelengths. The result is that we cannot get a clear image for all colours at the same time. For example, if we set up the optical system to give a clear image in blue light, images in all other colours will be slightly blurred, and we will see coloured outlines around objects in the image. There are two possible solutions to this problem. Reflectors have no chromatic aberration, therefore this aberration can be eliminated by replacing lenses with mirrors. However, this is not always practical: in certain optical systems the use of lenses is advantageous. Therefore opticians have developed so-called achromatic lenses or achromats, in which lenses of different materials are assembled to form a compound lens. On its own, each of the components has chromatic aberration, but since their materials are different, their chromatic aberrations are also different. If the lens is designed properly, the aberrations of the components largely cancel each other, and the aberration of the compound lens is minimised.
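The size of the effect can be estimated with a short calculation. The sketch below assumes the thin-lens lensmaker's equation $1/f = (n-1)(1/R_1 - 1/R_2)$, a symmetric biconvex lens, and approximate refractive indices of a typical crown glass at three wavelengths. The lens radius and the index values are illustrative assumptions, not data from the text.

```python
def focal_length(n, R1, R2):
    """Thin-lens lensmaker's equation: 1/f = (n - 1) * (1/R1 - 1/R2)."""
    return 1.0 / ((n - 1.0) * (1.0 / R1 - 1.0 / R2))


# Approximate indices of a typical crown glass (illustrative assumption)
indices = {
    "blue (486 nm)": 1.5224,
    "yellow (588 nm)": 1.5168,
    "red (656 nm)": 1.5143,
}

R = 100.0  # radius of curvature of both surfaces in mm (hypothetical lens)

focal = {colour: focal_length(n, R, -R) for colour, n in indices.items()}
for colour, f in focal.items():
    print(f"{colour}: f = {f:.2f} mm")

shift = focal["red (656 nm)"] - focal["blue (486 nm)"]
print(f"focal shift between red and blue: {shift:.2f} mm")
```

Because the index is larger for blue light, the blue focus lies closer to the lens than the red one. For this hypothetical lens the difference is of the order of a millimetre on a focal length of roughly 100 mm.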
Spherical aberration
Although in the previous sections we have considered spherical reflectors and refractors, it was mainly for the sake of simplicity. Our approximations are valid only for rays of light very close to the optical axis. It can be shown that the focal distance of a spherical lens or mirror changes with the distance from its centre (marked by $h$ on the figures). This means that a ray of light arriving at the edge of a spherical mirror will be focused to a different point than a ray arriving close to the centre. In other words, we can never get a perfectly clear image with a spherical lens or mirror. It can be proven that to avoid this kind of aberration the mirror should be a paraboloid instead of a segment of a sphere. Unlike a spherical mirror, a parabolic mirror can focus large diameter beams into a single point (as long as the beam arrives parallel to the optical axis).
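The dependence of the focal distance on $h$ can be made explicit for a concave spherical mirror. The sketch below is based on elementary geometry and is an illustration, not the derivation used in the text: a ray parallel to the axis at height $h$ hits a mirror of radius $R$ at an angle of incidence $\theta$ with $\sin\theta = h/R$, and after reflection crosses the axis at a distance $R/(2\cos\theta)$ from the centre of curvature.

```python
import math


def spherical_mirror_focus(h, R):
    """Distance from the mirror vertex to the point where a ray that arrives
    parallel to the axis at height h crosses the axis after reflection
    (concave spherical mirror of radius R, exact ray geometry)."""
    theta = math.asin(h / R)          # angle of incidence
    return R - R / (2.0 * math.cos(theta))


R = 1.0  # radius of curvature, arbitrary units
for h in (0.01, 0.1, 0.3, 0.5):
    print(f"h = {h:.2f} R  ->  focus at {spherical_mirror_focus(h, R):.4f} R")
```

For small $h$ the result approaches the paraxial value $R/2$, while rays far from the axis cross it noticeably closer to the mirror, which is exactly the spread of focal points described above.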
Coma
Comatic aberration is an inherent property of parabolic mirrors. Although they can focus a wide beam of light into a single point, this is possible only if the beam arrives parallel to the optical axis. Rays arriving from off-axis directions will not be focused to a single point. The result is that images of objects that are not in the centre of the field of view are going to be blurred. The image of an off-axis point source (such as a star which is not in the centre of the field of view) is not a single point, but a wedge-shaped smear, resembling the coma of a comet, hence the name. The effect may be reduced by introducing appropriately shaped correction plates or correction lenses into the optical system (as in the case of Maksutov or Schmidt telescopes), or by replacing some of the parabolic mirrors in the system by hyperbolic mirrors.
Field curvature
Photographic plates and digital image sensors, such as CCD or CMOS matrices, are usually flat. But most optical systems project the image of objects onto a slightly curved surface. (Imagine that an object is placed in front of an ideal lens. If the distance of the centre of the flat sensor from the centre of the lens is exactly the image distance, the system will give a clear image in the vicinity of the optical axis. But off-axis parts of the flat sensor are at a larger distance from the centre of the lens, therefore they are going to be out of focus.) In other words, either the edges or the centre of the image is going to be slightly blurred if we use a flat image sensor. The phenomenon is called Petzval field curvature. It can be remedied by building appropriately curved image sensors. This was relatively easy to achieve with traditional photographic films, as they were flexible and could be stretched to the appropriate shape. Modern semiconductor based sensors are too rigid and fragile for such techniques, but in some instances (such as in the case of the Kepler space telescope, where a large field of view was required) large mosaic-like image sensor arrays may be constructed to compensate for field curvature. Also, the lenses in modern cameras are designed to have larger focal distances for off-axis rays to minimise field curvature.
Astigmatism
Practical optical systems never have perfect rotational symmetry. The curvature of lenses and mirrors is usually slightly stronger in one direction, and their axes can never be perfectly aligned with the axis of the system either. This means that lenses and mirrors have slightly different focal distances in two perpendicular directions (such as the horizontal and vertical directions). If we place the screen or the image sensor into one of these focal planes, the image is going to be blurred in the other direction. There is no position where we could get an image which is perfectly clear in both directions. (Placing the sensor between the two focal planes smears the image in both directions.) Such aberrations, created by errors in the shape of lenses and mirrors or by misalignments, are usually referred to as astigmatism.
In the previous chapter we have discussed optical systems using geometrical optics, and ignored any wavelike behaviour. But as we have seen earlier, light is an electromagnetic wave, which means that in certain cases geometrical optics will not be sufficient to describe its behaviour. The first experiment demonstrating this was presented by Thomas Young in 1802.
Figure 9.1: Illumination of the screen from a single slit (image is not to scale)

Young put a thin plate with two small holes in it in front of a light source, and observed the light passing through it on a screen placed a few metres behind the plate. When either one of the holes was covered, and light was allowed to pass through only one of the holes, the whole screen was illuminated (figure 9.1). Based on geometrical optics, one would expect to get the same pattern with doubled intensity when both holes are uncovered. (The holes are so close to each other that light passing through either one of them gives virtually the same distribution on the screen.) But the experiment showed that when light was allowed to pass through both holes simultaneously, a pattern of bright and dark rings appeared on the screen, and the intensity of the bright spot in the centre did not double but quadrupled. These results cannot be explained by geometrical optics. To give a proper description of the phenomenon we have to consider light as a wave once again.
Figure 9.2: When light is allowed to pass through both slits simultaneously (double slit experiment) a diffraction pattern consisting of a series of bright and dark lines appears on the screen (image is not to scale)

Later Young repeated his experiment with thin slits instead of small circular holes to increase the intensity (hence the name: double slit experiment). In this case a series of bright and dark lines appears on the screen instead of rings, but the principle is the same. Electromagnetic waves leaving the source reach both slits, and pass through them. For an observer on the other side of the plate it seems as if both slits were emitting electromagnetic waves of the same frequency and wavelength. To calculate the light intensity at a given point of the screen we have to consider the superposition of these two waves.

The situation is very similar to the superposition of two harmonic oscillations of the same frequency. As we have seen in the previous semester, the superposition of two such oscillations is another oscillation of the same frequency, whose complex amplitude is the sum of the complex amplitudes of the individual oscillations. This means that the amplitude of the oscillation depends not only on the amplitudes of the individual oscillations, but also on their phase difference. The amplitude is maximal when the oscillations are in phase (their phase difference is 0 or an integer multiple of $2\pi$), and minimal if they are in opposing phases (the phase difference is $\pi$, $3\pi$, $5\pi$, etc.). In case of the double slit experiment we get a bright area on the screen where the two waves arrive in the same phase (constructive interference), and a dark area where they meet in opposing phases (destructive interference).

On figure 9.2 light coming from $S_1$ has to travel a longer distance to reach point $P$, therefore it will be late compared to light coming from $S_2$. To calculate the phase difference we have to determine this optical path difference. Since the triangle $S_1 S_2 C$ is similar to the triangle $PAB$, the angle $S_1 S_2 C$ is equal to $\theta$. Therefore:
$$ \Delta r = d \sin\theta \tag{9.1} $$
where $d$ is the distance between the slits.
Each wavelength of extra optical path increases the phase of a wave by $2\pi$, therefore the phase difference is:
$$ \Delta\varphi = 2\pi \frac{\Delta r}{\lambda} = 2\pi \frac{d \sin\theta}{\lambda} \tag{9.2} $$
It must be noted that the distance of the screen from the slits ($D$) is several orders of magnitude larger than the distance $y$ of the bright fringes from the centre of the screen ($y \ll D$), therefore the angle $\theta$ is very small. For small angles $\sin\theta \approx \tan\theta = y/D$, thus:
$$ \Delta\varphi = 2\pi \frac{d \sin\theta}{\lambda} \approx 2\pi \frac{d\,y}{\lambda D} \tag{9.3} $$
The screen will be bright where the two waves meet in the same phase, thus their phase difference is $2m\pi$, where $m$ is an integer. (In other words, their optical path difference is an integer multiple of the wavelength.) Therefore the criterion of constructive interference is:
$$ 2m\pi = 2\pi \frac{d \sin\theta}{\lambda} \approx 2\pi \frac{d\,y}{\lambda D} \tag{9.4} $$
$$ m\lambda = d \sin\theta \approx d\,\frac{y}{D} \tag{9.5} $$
Based on equation (9.5) we may determine the position of the bright fringes on the screen:
$$ y_m \approx \frac{m \lambda D}{d} \tag{9.6} $$
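Therefore the distance between the $m$th and $(m+1)$th bright fringe is:
$$ \Delta y = y_{m+1} - y_m \approx \frac{(m+1)\lambda D}{d} - \frac{m \lambda D}{d} \tag{9.7} $$
$$ \Delta y \approx \frac{\lambda D}{d} \tag{9.8} $$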
Thus the bright areas on the screen are equidistant. In a similar manner we may also determine the positions where the screen will be dark. For this the two waves have to meet in opposing phases (their phase difference must be $(2m+1)\pi$, where $m$ is an integer) to cancel each other out. Therefore the criterion of destructive interference is:
$$ (2m+1)\pi = 2\pi \frac{d \sin\theta}{\lambda} \approx 2\pi \frac{d\,y}{\lambda D} \tag{9.9} $$
$$ \left(m + \frac{1}{2}\right)\lambda = d \sin\theta \approx d\,\frac{y}{D} \tag{9.10} $$
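Equations (9.6), (9.8) and (9.10) are easy to evaluate numerically. The sketch below does this for a hypothetical set-up; the wavelength, slit separation and screen distance are assumed values chosen only for illustration, and the function names are not from the text.

```python
def bright_fringe_position(m, wavelength, D, d):
    """Position y_m of the m-th bright fringe, equation (9.6)."""
    return m * wavelength * D / d


def dark_fringe_position(m, wavelength, D, d):
    """Position of the m-th dark fringe, solved from equation (9.10)."""
    return (m + 0.5) * wavelength * D / d


# Hypothetical set-up (values are assumptions, not from the text)
wavelength = 633e-9   # wavelength in metres
d = 0.25e-3           # distance between the slits in metres
D = 2.0               # distance of the screen from the slits in metres

spacing = (bright_fringe_position(1, wavelength, D, d)
           - bright_fringe_position(0, wavelength, D, d))
print(f"fringe spacing: {spacing * 1e3:.3f} mm")

for m in range(3):
    yb = bright_fringe_position(m, wavelength, D, d) * 1e3
    yd = dark_fringe_position(m, wavelength, D, d) * 1e3
    print(f"m = {m}: bright at {yb:.3f} mm, dark at {yd:.3f} mm")
```

The dark fringes fall halfway between neighbouring bright ones, and the spacing does not depend on $m$ as long as the small-angle approximation holds.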
The phenomenon is usually referred to as diffraction. The bright areas are called diffraction lines or fringes, and the integer $m$ is the order of the diffraction. Based on the above description we may also understand why the intensity of the bright areas quadruples instead of just doubling. In case of constructive interference the two waves meet in the same phase, thus the amplitude doubles. But we have also deduced earlier that the intensity of light is proportional to the square of the amplitude. Thus, doubling the amplitude quadruples the intensity. This may seem to violate the principle of energy conservation (How can the intensity quadruple? Where is the extra light coming from?), but it does not. Do not forget that other areas of the screen, which were illuminated when light was allowed to pass through only one hole, become dark due to destructive interference. It is important to understand that neither constructive nor destructive interference can create or destroy energy. When we uncover the second slit the total amount of light reaching the screen doubles, but its distribution also changes. Diffraction redirects some of the light from the dark areas to the bright ones without changing the total power that reaches the screen. Therefore diffraction (and interference phenomena in general) does not violate the principle of energy conservation, it only changes the distribution of light.
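The energy argument can be verified with the phasor picture. The sketch below adds two complex amplitudes of equal magnitude and averages the resulting intensity over one full period of the phase difference. It assumes that each slit alone illuminates the screen uniformly with intensity $I_0$, which is a simplification made only for this illustration.

```python
import cmath
import math


def two_slit_intensity(dphi, I0=1.0):
    """Intensity of two superposed waves of equal amplitude sqrt(I0)
    with phase difference dphi (sum of the complex amplitudes, squared)."""
    amplitude = math.sqrt(I0) * (1.0 + cmath.exp(1j * dphi))
    return abs(amplitude) ** 2


N = 10000
mean_intensity = sum(two_slit_intensity(2.0 * math.pi * k / N) for k in range(N)) / N

print(f"bright fringe:  I = {two_slit_intensity(0.0):.3f} I0")
print(f"dark fringe:    I = {two_slit_intensity(math.pi):.3f} I0")
print(f"screen average: I = {mean_intensity:.3f} I0")
```

The bright fringes reach four times the single-slit intensity, the dark fringes drop to zero, and the average over the pattern is twice the single-slit value: the power is doubled, as expected for two slits, and only redistributed.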
9.2 Coherence
Equation (9.6) shows that the position of the diffraction lines depends not only on the distance of the slits from each other and from the screen, but also on the wavelength of the light. This means that if we repeat the experiment using a light source of a different wavelength, the position of the diffraction lines also changes. From equation (9.6) we may determine the shift in the position of the diffraction lines due to a shift $\Delta\lambda$ in wavelength:
$$ \Delta y_m \approx \frac{m D\,\Delta\lambda}{d} \tag{9.11} $$
The problem is that practical light sources are never completely monochromatic: their spectrum always includes different wavelengths. This means that diffraction patterns belonging to different wavelengths in the spectrum of the light source appear on the screen superimposed on each other. Because of the multiplier $m$ on the right-hand side of equation (9.11), higher order diffraction lines suffer larger shifts in position due to the same shift in wavelength. This means that while the first order diffraction lines may be clearly resolved, the 15th order bright line of one wavelength may very well coincide with the 16th order dark line of another wavelength. In other words, higher order diffraction lines get smeared more easily. The result is that only a finite number of lines are visible in the diffraction pattern: the lower order lines are usually well resolved, but as the order of diffraction increases the pattern gets more and more smeared, and after a point the lines cannot be distinguished any more. Of course the number of visible lines depends on the light source: the more monochromatic it is, the smaller the shifts in line positions will be, and the more lines will appear clearly resolved. The maximum optical path difference at which interference can still be observed (the diffraction lines are still resolved) is called the coherence length of the light source:
$$ L = \frac{c}{n\,\Delta f} = \frac{\lambda^2}{n\,\Delta\lambda} \tag{9.12} $$
where $c$ is the speed of light in vacuum, $n$ is the index of refraction of the medium in which the experiment is done (thus $c/n$ is the speed of light in the medium), $f$ is the frequency of the light, and $\lambda$ is its wavelength. $\Delta f$ and $\Delta\lambda$ are the spectral widths of the light source in frequency and in wavelength, respectively.

But coherence means slightly more than just a monochromatic spectrum. Diffraction (like any other interference phenomenon) depends on the phase of the interfering waves. A more precise definition of coherence length states that it is the propagation distance over which a wave maintains its coherence, or in other words over which its phase remains predictable. One may appreciate that this definition is very similar to the previous one. Waves of different wavelengths complete a different number of oscillations while travelling the same distance, thus on arrival their phases will be different. The longer they have to travel and the larger the wavelength difference is, the larger the phase difference they accumulate. The phase of waves emitted from a non-monochromatic source becomes less and less predictable the longer they need to travel, and the less monochromatic the source is. Coherence length is the distance over which the phase remains predictable, or in other words over which the diffraction patterns of different wavelengths do not smear each other.

Another way to characterise the light source is to give its coherence time. By definition, coherence time is the time over which a wave may be considered coherent, or in other words over which it maintains a predictable phase. The coherence time can be calculated by dividing the coherence length by the velocity of propagation:
$$ \tau = \frac{1}{\Delta f} = \frac{\lambda^2}{c\,\Delta\lambda} \tag{9.13} $$
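To get a feeling for the orders of magnitude, equations (9.12) and (9.13) can be evaluated for two hypothetical sources. The spectral widths below (a white-light source and a narrow-band laser line) are assumed round numbers chosen only for illustration, not measured data.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def coherence_length(wavelength, dwavelength, n=1.0):
    """Coherence length, equation (9.12): L = lambda^2 / (n * dlambda)."""
    return wavelength ** 2 / (n * dwavelength)


def coherence_time(wavelength, dwavelength):
    """Coherence time, equation (9.13): tau = lambda^2 / (c * dlambda)."""
    return wavelength ** 2 / (C * dwavelength)


# Hypothetical sources: (name, central wavelength in m, spectral width in m)
sources = [
    ("white light", 550e-9, 300e-9),
    ("narrow laser line", 633e-9, 1e-12),
]
for name, lam, dlam in sources:
    L = coherence_length(lam, dlam)
    tau = coherence_time(lam, dlam)
    print(f"{name}: L = {L:.3e} m ({L / lam:.1f} wavelengths), tau = {tau:.3e} s")
```

For the assumed white-light source the coherence length is only about two wavelengths, whereas the assumed laser line gives a few tenths of a metre.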
It must be noted that the invention of lasers was a very important achievement in the history of science and technology, as they are the most coherent (and most monochromatic) man-made light sources. While the coherence lengths of traditional light sources extend only to a couple of wavelengths, the coherence length of even a simple multimode laser is in the range of dozens of centimetres, while that of single mode lasers may be hundreds of metres, and the coherence length of fibre lasers may exceed a hundred kilometres!
9.3
A similar experiment may be carried out using not two, but multiple slits. For the sake of simplicity let us consider a triple slit experiment. According to figure 9.3 the optical path difference and the phase shift between waves originating from neighbouring slits may be calculated in a similar fashion as before:
$$ \Delta\varphi = 2\pi \frac{d \sin\theta}{\lambda} \approx 2\pi \frac{d\,y}{\lambda D} \tag{9.14} $$
Figure 9.3: In case of a triple slit experiment the positions of the major peaks are the same as in the double slit experiment, but they are narrower and more intense. Also a minor peak appears between each pair of major ones.

The main difference is that in this case we have to consider the superposition of not two but three waves. Naturally, the resulting complex amplitude is going to be the sum of the complex amplitudes of all three waves. In the centre of the screen all three waves meet in phase, thus all three phasors point in the same direction (figure 9.4) and consequently the amplitude triples (compared to the amplitude of the wave coming from a single slit), while the intensity of the light increases ninefold.
Figure 9.4: The waves originating from the three slits reach the centre of the screen in the same phase, thus the three phasors representing them point in the same direction. This triples the amplitude and increases the intensity ninefold.

As we move away from the centre of the screen, the optical path difference (and the phase difference) between the waves increases, and the intensity decreases. (The three waves are no longer in phase, thus the phasors do not point in the same direction.) The intensity becomes minimal when the phase difference reaches $2\pi/3$ or 120° (figure 9.5). In this case the three phasors cancel each other out, thus the amplitude and the intensity of light are zero.
Figure 9.5: When the phase difference between the waves is 120° they cancel each other out.
If we move further away from the centre the phase difference keeps increasing, and the three waves no longer cancel each other out, thus the intensity starts increasing again. We may detect a minor maximum when the phase difference reaches $\pi$ (figure 9.6). In this case two of the phasors point in one direction and the third in the opposite direction. Thus the amplitude is the same as the amplitude of a single wave, and the intensity is equal to the intensity we may detect when the screen is illuminated only by a single slit.
Figure 9.6: When the phase difference is 180° two phasors point in one direction, and the third in the opposite direction. Thus the amplitude and the intensity are the same as if the screen were illuminated through a single slit.

Further away from the centre the intensity decreases again until the phase difference reaches $4\pi/3$ or 240° (figure 9.7). In this case the three phasors cancel each other out yet again, and the intensity is zero. After this, the amplitude increases with the phase difference, until at a phase difference of $2\pi$ the phasors point in the same direction yet again (figure 9.4), thus the amplitude is three times the amplitude of a single wave, and the intensity is nine times higher than what we may detect when the screen is illuminated through a single slit.

If we keep moving away from the centre this pattern repeats itself over and over again. Major maxima appear where the phase difference between waves originating from neighbouring slits is an integer multiple of $2\pi$. (In other words, the optical path difference is an integer multiple of the wavelength.) Thus the position of these major maxima is the same as in case of the double slit experiment. Since in these cases the amplitude is three times the amplitude of a single wave, these major maxima are nine times more intense than the illumination from a single slit. Halfway between each pair of major maxima there is a minor maximum (where two phasors point in the same direction, and one in the opposite direction). The maxima are separated by dark regions, where the phase difference is either 120° or 240°, and the three waves cancel each other out.
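The phasor sums of figures 9.4 to 9.7 can be reproduced with complex numbers. The following sketch assumes three waves of unit amplitude with a constant phase difference between neighbouring slits. The function name is illustrative.

```python
import cmath
import math


def n_slit_amplitude(dphi, n_slits=3):
    """Magnitude of the sum of n_slits unit phasors, where neighbouring
    phasors differ in phase by dphi (in radians)."""
    return abs(sum(cmath.exp(1j * k * dphi) for k in range(n_slits)))


for degrees in (0, 120, 180, 240, 360):
    A = n_slit_amplitude(math.radians(degrees))
    print(f"{degrees:3d} deg: amplitude = {A:.3f}, intensity = {A ** 2:.3f}")
```

The amplitude is 3 (intensity 9) at 0° and 360°, zero at 120° and 240°, and equal to that of a single wave at 180°, in agreement with the discussion above.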
Figure 9.7: When the phase difference reaches 240°, the three waves cancel each other out yet again.

Following this scheme, it is easy to see what happens when we increase the number of slits. Major maxima always appear at the same positions, given by equation (9.6), where the phase difference between waves originating from neighbouring slits is an integer multiple of $2\pi$. In these cases all phasors point in the same direction, thus the amplitude is proportional to the number of slits, and the intensity is proportional to its square. Between the major maxima a number of minor maxima appear. While a double slit experiment gives no minor maxima, a triple slit experiment gives a single minor maximum between neighbouring major ones. A four slit experiment gives two minor lines, a five slit experiment gives three, and so on. The number of minor lines between the major ones is always the number of slits minus two. It is also worth noting that as the number of slits increases, the major lines not only become more intense, but also get narrower.

Using a very large number of slits results in very narrow diffraction lines (compared to the distance between the lines). Thus the lines are well separated. This, combined with the fact that line positions depend on the wavelength of light, makes a plate with a very large number of slits an ideal dispersive element. In other words, it can be used to separate light of different wavelengths. The first optical spectrometers used prisms to separate different wavelengths.¹ But most modern optical spectrometers use so-called diffraction gratings for this purpose. These act like a plate with a large number of slits in the multi-slit experiment: they diffract different wavelengths in different directions. As these gratings have a very large number of slits,² each diffraction line is very narrow, thus the spectrum of light falling on the grating is well resolved. It must be noted however that these diffractive spectrometers may generate certain artefacts in the spectrum. (For example, the first order diffraction line of a long wavelength may overlap with the second order diffraction line of a shorter wavelength, such as one half as long.)

Figure 9.8: The intensity of the major lines increases with the square of the number of slits. They also get narrower and a series of minor peaks appears between them.

¹ The refractive index of glass depends on the wavelength of light. Thus different colours are refracted into slightly different directions when light enters or leaves the prism through a face that is not perpendicular to its direction. Therefore a prism can be used to resolve white light into the colours of the rainbow, as was demonstrated by Newton in the late 1660s. (In fact, actual rainbows are created in a similar manner when sunlight is refracted by small raindrops floating in the air after rain.)
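The rule for the number of minor maxima can be checked numerically. The sketch below uses the closed form of the phasor sum for $N$ slits, $I/I_0 = \left(\sin(N\Delta\varphi/2)/\sin(\Delta\varphi/2)\right)^2$, which follows from summing the geometric series of unit phasors. It is a standard result that the text does not state explicitly. The local maxima between two neighbouring major maxima are counted by simple sampling, and the sample count is an arbitrary choice.

```python
import math


def n_slit_intensity(dphi, n_slits):
    """Relative intensity (single slit = 1) for a phase difference dphi
    between waves from neighbouring slits."""
    s = math.sin(dphi / 2.0)
    if abs(s) < 1e-12:
        # Major maximum: all phasors point in the same direction.
        return float(n_slits ** 2)
    return (math.sin(n_slits * dphi / 2.0) / s) ** 2


def count_minor_maxima(n_slits, samples=20000):
    """Count local maxima strictly between the major maxima
    at dphi = 0 and dphi = 2*pi."""
    xs = [2.0 * math.pi * k / samples for k in range(1, samples)]
    ys = [n_slit_intensity(x, n_slits) for x in xs]
    return sum(1 for i in range(1, len(ys) - 1)
               if ys[i - 1] < ys[i] > ys[i + 1])


for n in range(2, 7):
    print(f"{n} slits: major maximum = {n_slit_intensity(0.0, n):.0f}, "
          f"minor maxima between major ones = {count_minor_maxima(n)}")
```

The height of the major maxima grows as $N^2$, and the count of minor maxima follows the rule of the number of slits minus two.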
9.4 Fraunhofer diffraction
Experiments show that under certain circumstances diffraction patterns may be obtained even if we use only a single slit. This kind of diffraction is referred to as Fresnel or Fraunhofer diffraction, depending on the divergence of the incident light. In case of Fresnel diffraction the light source is relatively close to the slit, the incoming rays are divergent, and light reaches the slit in the form of spherical waves. On the other hand, if the light source is relatively far from the slit, the divergence of the incoming rays becomes negligible, and light reaches the slit as a plane wave. Such cases are referred to as Fraunhofer diffraction.
Figure 9.9: Diffraction on a single wide slit.

The two phenomena are similar to each other, and for the sake of simplicity we will discuss only Fraunhofer diffraction. In this case not only the incoming waves are plane waves, but the diffracted waves, too. This means that the screen should either be very far from the slit, or we should place a lens between the slit and the screen. (A lens focuses rays heading in the same direction into a single point on a screen placed in its focal plane. Thus the lens projects each diffraction direction to a different point of the screen.)

To understand the phenomenon, imagine that we divide the single slit into a large number of small segments. Each of these segments acts as a light source, emitting electromagnetic waves with a constant phase displacement from each other. The diffraction pattern is produced by the interference of these waves, and just like in case of the multi-slit experiment, the amplitude at a given point of the screen may be calculated as the sum of the phasors representing the complex amplitudes of the waves. From figure 9.9 the phase difference between waves from neighbouring segments is:
$$ d\varphi = 2\pi \frac{dr}{\lambda} = 2\pi \frac{da\,\sin\theta}{\lambda} \approx 2\pi \frac{da}{\lambda}\,\frac{y}{D} \tag{9.15} $$
where $a$ is the width of the slit, and $da$ is the width of each segment. Since the phase difference is constant, and the amplitudes are the same (the slit is illuminated homogeneously, and the sizes of the segments are identical), the phasors representing the complex amplitudes of the waves form a circular arc. The angle of this arc is:
$$ \varphi = \int d\varphi = 2\pi \frac{a \sin\theta}{\lambda} \approx 2\pi \frac{a\,y}{\lambda D} \tag{9.16} $$

² It must be noted that modern optical gratings are often so-called reflective gratings, which are made by creating a very dense line pattern on a reflective surface. Due to the pattern, light can be reflected from some areas (that act like the slits in the traditional multi-slit experiment), and not from others. An observer looking at the reflection of a light source will perceive it as if looking at it through a plate containing a large number of narrow slits.
The total amplitude in the direction $\theta$ may be calculated from the OPQ triangle on figure 9.10:
$$ A = 2R \sin\frac{\varphi}{2} \tag{9.17} $$
The only unknown in equation (9.17) is the parameter $R$. This may be determined from the length of the circular arc formed by the phasors representing the complex amplitudes of the waves. As the magnitude of these amplitudes is constant, the total length of the arc is also constant. (It does not depend on $\theta$.) Let us denote the intensity in the centre of the screen by $I_0$, and the corresponding amplitude by $A_0$. Since all waves reach this point in the same phase, all phasors point in the same direction. Thus the length of the arc is equal to $A_0$. From this:
$$ A_0 = R\varphi \quad\Rightarrow\quad R = \frac{A_0}{\varphi} \tag{9.18} $$
Substituting this into equation (9.17) gives:
$$ A = 2\,\frac{A_0}{\varphi} \sin\frac{\varphi}{2} = A_0\,\frac{\sin(\varphi/2)}{\varphi/2} \tag{9.19} $$
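Figure 9.10: The phasors representing the complex amplitudes of waves originating from different segments of the slit form a circular arc.

The intensity is proportional to the square of the amplitude, thus
$$ I = I_0 \left( \frac{\sin(\varphi/2)}{\varphi/2} \right)^2 \tag{9.20} $$
The intensity is zero where $\sin(\varphi/2) = 0$, thus the criterion of destructive interference is:
$$ \frac{\varphi}{2} = m\pi \tag{9.21} $$
$$ \varphi = 2m\pi \tag{9.22} $$
$$ 2\pi \frac{a \sin\theta}{\lambda} = 2m\pi \tag{9.23} $$
$$ a \sin\theta = m\lambda \tag{9.24} $$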
where $m$ is a non-zero integer ($m = 0$ corresponds to the central maximum). Note that although equation (9.24) takes the same form as equation (9.5), here $a$ denotes the width of the single slit, and not the distance between the slits. Also, in case of Fraunhofer diffraction the equation gives the position of the dark areas, whereas in case of the multi-slit experiment it gave the position of the bright areas.
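Equations (9.16), (9.20) and (9.24) can be combined into a short numerical check. The slit width and wavelength below are assumed values for a hypothetical experiment, and the function names are illustrative.

```python
import math


def single_slit_intensity(theta, a, wavelength, I0=1.0):
    """Fraunhofer intensity of a single slit of width a in the
    direction theta, equations (9.16) and (9.20)."""
    phi = 2.0 * math.pi * a * math.sin(theta) / wavelength
    if phi == 0.0:
        return I0                      # central maximum
    half = phi / 2.0
    return I0 * (math.sin(half) / half) ** 2


def dark_angle(m, a, wavelength):
    """Direction of the m-th dark line, equation (9.24): a sin(theta) = m lambda."""
    return math.asin(m * wavelength / a)


# Hypothetical set-up (assumed values)
a = 0.1e-3            # slit width in metres
wavelength = 633e-9   # wavelength in metres

for m in (1, 2, 3):
    theta = dark_angle(m, a, wavelength)
    print(f"m = {m}: theta = {math.degrees(theta):.4f} deg, "
          f"I/I0 = {single_slit_intensity(theta, a, wavelength):.2e}")

# Halfway between the first two dark lines the intensity is small but not zero
theta_mid = math.asin(1.5 * wavelength / a)
print(f"between m = 1 and m = 2: I/I0 = {single_slit_intensity(theta_mid, a, wavelength):.4f}")
```

The intensity vanishes (up to rounding) in the directions given by equation (9.24), and the secondary maxima between the dark lines are much weaker than the central one.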
The significance of the phenomenon is that it determines the maximal resolution achievable by an optical system. Every such system (be it a telescope, a microscope or any other optical instrument) has an aperture where light enters the system. This aperture acts like the slit in the derivation above. Light passing through it will undergo diffraction, changing its direction of propagation and smearing the image. Imagine that we have an optical system which projects the image of an object onto a screen. Even if light is coming from a point source it can never be projected into a single point. Instead it will produce a diffraction pattern like the one on figure 9.9. Even the central maximum has a finite width, thus if two points of the object are too close to each other, the corresponding diffraction patterns will overlap on the screen, and we will not be able to distinguish them. In other words, even if all aberrations of the optical system are negligible, and everything is in perfect focus, the achievable resolution is limited by diffraction. Therefore such optical systems are referred to as diffraction limited optical systems.

From the derivation above it follows that the achievable resolution depends both on the wavelength of the light and on the size of the aperture (or entrance slit) through which light enters the optical system. The shorter the wavelength and the larger the aperture, the better the resolution is going to be. According to the Rayleigh criterion, two points of the image are resolved if the central maximum of the diffraction pattern of one of them lies no closer to the central maximum of the other than the first minimum of that other pattern. It must be noted that in case of circular apertures equation (9.24) takes a slightly different form:
$$ D \sin\theta = p_m \lambda \tag{9.25} $$
where $D$ is the diameter of the aperture and $p_1 = 1.22$, $p_2 = 2.233$, $p_3 = 3.238$, etc. The diffraction angles in practical optical systems are usually so small that $\sin\theta$ may be approximated by $\theta$. Using these, the Rayleigh criterion may be stated in mathematical form:
$$ \theta = 1.22\,\frac{\lambda}{D} \tag{9.26} $$
Thus the smallest angle resolved by a diffraction limited optical system is proportional to the wavelength of the light and inversely proportional to the diameter of the aperture. Based on equation (9.26), it is easy to understand why astronomers prefer large telescopes. It is not only because a larger main mirror collects more light and produces a brighter image: the maximal resolution of the optical system also depends on the size. It must be noted that while the brightness of the image depends on the area of the main mirror, the resolution depends on the distance between its most distant edges. This means that if parts of the main mirror are missing, it decreases only the brightness of the image, and has little or no effect on the resolution (as long as the maximal distance between the remaining parts is the same as in case of a full mirror).
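Equation (9.26) can be evaluated for a few aperture sizes to see how strongly the resolution depends on the diameter. The wavelength and the diameters below are assumed illustrative values, not the parameters of any particular instrument.

```python
import math


def rayleigh_angle(wavelength, diameter):
    """Smallest resolvable angle in radians for a circular aperture,
    equation (9.26): theta = 1.22 * lambda / D."""
    return 1.22 * wavelength / diameter


ARCSEC_PER_RADIAN = math.degrees(1.0) * 3600.0

wavelength = 550e-9  # green light, in metres (assumed)
for diameter in (0.1, 1.0, 10.0):  # aperture diameters in metres (assumed)
    theta = rayleigh_angle(wavelength, diameter)
    print(f"D = {diameter:5.1f} m: theta = {theta:.3e} rad "
          f"= {theta * ARCSEC_PER_RADIAN:.4f} arcsec")
```

Each tenfold increase of the diameter improves the diffraction limit by a factor of ten, independently of the light-gathering area.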
This means that in order to achieve a high resolution, we do not have to build a complete large mirror. It is enough to build its most distant sections, since the missing parts will not affect the resolution, only the brightness of the image. In practice this is realised by building several smaller telescopes that can move together and point to the same section of the sky. Light collected by these telescopes is combined, thus making it possible for the telescopes to function as one large telescope. Such systems are usually referred to as astronomical interferometers, and the distance between the parts is called the baseline of the interferometer. The first such system was the Michelson stellar interferometer. It had such a high resolution that in 1920 it made it possible to measure the diameter of distant stars for the first time. Today many modern telescopes are capable of working in interferometric mode, including the Keck Observatory in Hawaii, the Large Binocular Telescope in Arizona, or the Very Large Telescope, operated by the European Southern Observatory in Chile. All of these systems have several telescopes that may be linked together to function as an astronomical interferometer, greatly improving their resolution.

It must also be noted that since the 1940s radio telescopes have also been routinely used in interferometric mode. That is the reason why large radio telescope systems usually have several smaller dishes instead of a single large one. The individual radio telescopes do not even have to be at the same location. Measurement data recorded by radio telescopes thousands of kilometres away from each other may be transmitted to a single centre for processing, forming in effect a single large radio telescope. The largest such system in operation today is the Very Long Baseline Array, whose telescopes are located all over the United States, forming an astronomical interferometer with a maximal baseline of 8611 kilometres. But the Rayleigh criterion affects not only astronomy.
Since the wavelength of visible light is several hundred nanometres, traditional optical microscopes usually have a resolution of only a few microns, or a few hundred nanometres at best.3 This is one of the reasons why scientists prefer electron microscopes. Electron beams accelerated to a suitably high energy have wavelengths several orders of magnitude smaller than visible light, making it possible for certain types of electron microscopes to achieve atomic resolution, where individual atoms of the sample are visible in the image. The Rayleigh criterion also poses a serious problem to the semiconductor industry. Since integrated circuits are manufactured using photolithography, the achievable resolution is also limited by diffraction phenomena. Currently (2013) the most advanced
3 It must be noted that in microscopy the Rayleigh criterion is given in a slightly different form: $R = \frac{1.22\,\lambda}{NA}$, where $R$ is the smallest distance resolved by the microscope, $\lambda$ is the wavelength of the light, $NA = n_m \sin\theta$ is called the numerical aperture of the lens, $\theta$ is the half-angle of the maximum cone of light that can enter the lens, and $n_m$ is the refractive index of the medium surrounding the lens. Since the numerical aperture depends on the refractive index of the medium, spatial resolution may be improved by immersing the optical system in a refractive liquid medium (for example water). Hence such systems are referred to as immersion optics, and they are widely used in modern semiconductor manufacturing technology.
semiconductor devices use 22 nm technology, which means that the smallest feature of the device is approximately 22 nm. These devices are manufactured using 193 nm UV light. The problem is that decreasing the wavelength of light to improve resolution has become increasingly hard in the past few years. Currently it is expected that the next technological step (14 nm devices, coming into production in 2015) will be achievable using a more advanced version of the current technology, but the step after that (10 nm, est. 2017) will require a significantly different manufacturing process. Due to the difficulties posed by extreme ultraviolet lithography, it is currently expected that 10 nm technology will use the same 193 nm wavelength as today's technology, and the improvement in resolution will be made possible by other means, such as multiple patterning. In this case high resolution patterns are achieved by overlapping several lower resolution patterns. (The overlapping area might be considerably smaller than the resolution of each pattern, thus increasing the overall resolution.)
9.5
Thin layer interference is a type of interference that we have all encountered in our daily life. This is what makes soap bubbles, and oil spills after rain, sparkle in rainbow colours. The reason for the phenomenon is that sunlight contains all colours of the rainbow, and reflection from such thin layers depends on the ratio of the layer thickness and the wavelength of the light. A layer of a given thickness may reflect one wavelength but not another, giving it a distinctive hue. Consequently, as the thickness of the layer changes from point to point, so does its hue. Since both oil spills and soap bubbles have varying thickness, they sparkle in rainbow colours.

To better understand the phenomenon let us consider the reflection of light from the wall of a soap bubble. (This wall is basically a very dilute aqueous solution, thus its index of refraction is similar to that of water.) Some of the light arriving at the air/water interface will be reflected back, while the rest will be transmitted into the water. The transmitted light will reach the water/air interface on the other side, and again, some will be reflected back, while the rest will be transmitted, and leave the wall of the bubble. The waves reflected back from the air/water and from the water/air interface may both reach an observer, and interfere with each other. Depending on the phase difference, the interference may either enhance the intensity (constructive interference), or the waves may cancel each other out (destructive interference).

To determine the phase difference, first we have to determine the optical path difference between the rays reflected back from the two interfaces. In figure 9.11 the ray that enters the wall and bounces back from the water/air interface travels from point A to B, and then to point C. Due to the symmetry of reflection the AB distance is the same as the BC distance.

Figure 9.11: Reflection from a thin layer: Light may be reflected back from the air/water or from the water/air interface. These waves may interfere with each other, and depending on their phase difference either enhance or cancel out each other.

Let $\alpha$ denote the angle of incidence and $\beta$ the angle of refraction inside the layer. The AB distance may be determined from the AEB triangle:

$$\cos\beta = \frac{d}{AB} \;\Rightarrow\; AB = \frac{d}{\cos\beta} \tag{9.27}$$

where $d$ is the thickness of the layer. Thus, the optical path4 is:

$$l_1 = 2n\,AB = \frac{2nd}{\cos\beta} \tag{9.28}$$

where $n$ is the index of refraction of the layer. While part of the light travels on the ABC path, the rest travels from point A to point D. Since the CD section is perpendicular to the AD section, the ACD angle equals $\alpha$. From the ACD triangle:

$$\sin\alpha = \frac{AD}{AC} \;\Rightarrow\; AD = AC\sin\alpha \tag{9.29}$$

Due to symmetry the AC section is twice as long as the AF section, whose length is the same as the length of the EB section, which may be determined from the EAB triangle:

$$\operatorname{tg}\beta = \frac{EB}{d} \;\Rightarrow\; EB = d\operatorname{tg}\beta \tag{9.30}$$

$$AC = 2\,EB = 2d\operatorname{tg}\beta \tag{9.31}$$

Substituting this back into equation (9.29) gives the optical path of the wave reflected back from the air/water interface:

$$l_2 = AD = AC\sin\alpha = 2d\operatorname{tg}\beta\,\sin\alpha \tag{9.32}$$

Thus the optical path difference is:

$$\Delta l = l_1 - l_2 = \frac{2nd}{\cos\beta} - 2d\operatorname{tg}\beta\,\sin\alpha \tag{9.33}$$

$$\Delta l = \frac{2nd}{\cos\beta}\left(1 - \frac{\operatorname{tg}\beta\,\sin\alpha}{n}\cos\beta\right) \tag{9.34}$$

According to Snell's law:

$$\frac{\sin\alpha}{n} = \sin\beta \tag{9.35}$$

$$\Delta l = \frac{2nd}{\cos\beta}\left(1 - \operatorname{tg}\beta\,\sin\beta\,\cos\beta\right) \tag{9.36}$$

$$\Delta l = \frac{2nd}{\cos\beta}\left(1 - \sin^2\beta\right) \tag{9.37}$$

According to the Pythagorean theorem:

$$\sin^2\beta + \cos^2\beta = 1 \;\Rightarrow\; 1 - \sin^2\beta = \cos^2\beta \tag{9.38}$$

Thus:

$$\Delta l = 2nd\,\frac{\cos^2\beta}{\cos\beta} = 2nd\cos\beta \tag{9.39}$$

4 Note that the optical path is the length of the trajectory multiplied by the refractive index of the medium. (So far we have ignored this, since the refractive index of air is very close to 1. But the refractive index of water is significantly higher, thus in this case we have to take it into account.) The speed of light in a medium is smaller than in vacuum. This means that light requires more time to travel the same distance in water. But as we have seen at the end of chapter 7, the frequency of light does not change when it enters the medium. This means that a longer propagation time causes a proportionally larger phase change. This is taken into account by the multiplier $n$.
Each wavelength of optical path difference corresponds to a phase difference of $2\pi$, therefore the phase difference due to the different optical path lengths is:

$$\Delta\varphi_l = 2\pi\frac{\Delta l}{\lambda} = 2\pi\frac{2nd\cos\beta}{\lambda} \tag{9.40}$$

where $\lambda$ is the wavelength of the light in vacuum and $\beta$ is the angle of refraction inside the layer.

But there is one more thing to consider. It is a known fact that when light is reflected back from the surface of an optically denser medium, it suffers a phase shift of $\pi$. Therefore the wave which is reflected back from the air/water interface suffers a phase shift of $\pi$. But the other wave, which is reflected back from the water/air interface, does not suffer such a phase shift. This causes an extra phase difference between the two waves on top of the phase difference due to their different optical path lengths:

$$\Delta\varphi_r = \pi \tag{9.41}$$

$$\Delta\varphi = \Delta\varphi_l + \Delta\varphi_r = 2\pi\frac{2nd\cos\beta}{\lambda} + \pi \tag{9.42}$$

When the phase difference is 0, or $2\pi$, or $4\pi$, etc., the two waves meet in phase. Therefore the criterion of constructive interference is:

$$2\pi\frac{2nd\cos\beta}{\lambda} + \pi = 2m\pi \tag{9.43}$$

$$2nd\cos\beta = \left(m - \frac{1}{2}\right)\lambda \tag{9.44}$$

where $m$ is an integer. The criterion of destructive interference is:

$$2\pi\frac{2nd\cos\beta}{\lambda} + \pi = (2m + 1)\pi \tag{9.45}$$

$$2nd\cos\beta = m\lambda \tag{9.46}$$
The criteria for constructive and destructive interference may be determined in a similar fashion for transmission too (see figure 9.12). In this case one wave travels along the ABH path, while the other travels along the ABCG path. Their optical path difference is again:

$$\Delta l = 2nd\cos\beta \tag{9.47}$$

The only difference is that in this case neither wave suffers a phase shift at reflection. Thus the criterion of constructive interference is:

$$2nd\cos\beta = m\lambda \tag{9.48}$$

While the criterion for destructive interference is:

$$2nd\cos\beta = \left(m - \frac{1}{2}\right)\lambda \tag{9.49}$$
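The reflection criteria derived above can be turned into a short numerical sketch that lists which visible wavelengths a film of given thickness reflects strongly or weakly. This is only an illustration: the function names, the 300 nm thickness, the refractive index of 1.33 (water-like) and normal incidence (cos β = 1) are assumptions made here, not values from the text.

```python
def _wavelengths(path_nm, offset, band):
    """Wavelengths lam in `band` with path_nm = (m - offset) * lam, m = 1, 2, ..."""
    if band[0] <= 0:
        raise ValueError("lower band edge must be positive")
    found = []
    m = 1
    while True:
        lam = path_nm / (m - offset)
        if lam < band[0]:        # wavelengths only get shorter as m grows
            break
        if lam <= band[1]:
            found.append(lam)
        m += 1
    return found

def reflected_maxima(thickness_nm, n=1.33, cos_beta=1.0, band=(380.0, 750.0)):
    """Constructive reflection: 2 n d cos(beta) = (m - 1/2) * lambda."""
    return _wavelengths(2.0 * n * thickness_nm * cos_beta, 0.5, band)

def reflected_minima(thickness_nm, n=1.33, cos_beta=1.0, band=(380.0, 750.0)):
    """Destructive reflection: 2 n d cos(beta) = m * lambda."""
    return _wavelengths(2.0 * n * thickness_nm * cos_beta, 0.0, band)

if __name__ == "__main__":
    d = 300.0  # assumed film thickness in nanometres
    print("strongly reflected (nm):", reflected_maxima(d))
    print("weakly reflected (nm):  ", reflected_minima(d))
```

Because the transmission criteria (9.48) and (9.49) are the same two conditions with their roles exchanged, the wavelengths returned by `reflected_minima` are the ones transmitted most strongly.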
Note that the criteria for constructive and destructive interference for reflection and for transmission are precisely the opposite of each other. When the reflection is maximal, the transmission is minimal, and vice versa. Just like in the case of diffraction, interference cannot create or destroy energy: destructive interference does not mean that the energy of light is lost. It is simply directed somewhere else.

Figure 9.12: Transmission through a thin layer

It must also be noted that the criteria deduced above would be the same if the thin layer had a lower index of refraction than its surroundings (such as an air gap between two pieces of glass). Just like above, one of the waves would suffer a phase shift at reflection. The only difference is that in that case it would be the wave which travels through the layer. On the other hand, if the thin layer is sandwiched between a medium with a lower index of refraction and another one with a higher index of refraction, the criteria for constructive and destructive interference would be reversed, since in that case either both or neither of the waves would suffer a phase shift at reflection.

Thin layer interference occurs not only on oil spills and soap bubbles: it also has widespread applications in optics. As we have seen above, a thin layer may reduce reflection from a surface. Of course a single layer reduces reflection only at wavelengths that satisfy equation (9.46) or (9.49), depending on the index of refraction of the substrate onto which the layer has been deposited. (Don't forget that if the index of refraction of the substrate is higher than that of the layer, the criteria are reversed.) But it is also possible to design layer structures consisting of a large number of thin layers with carefully chosen thicknesses and refractive indices that can reduce reflection in a wide band of wavelengths. Such anti-reflection coatings are usually deposited on the surfaces of different optical components. (For example, in photography the so-called lens flare effect is caused by reflections on the surfaces of lenses inside the objective of the camera. By depositing a carefully designed anti-reflection coating on these surfaces, the lens flare effect may be greatly reduced.) Similar layers are deposited on prescription glasses, too.

Laser mirrors also utilise thin layer interference. Traditional mirrors are usually produced by depositing a metal layer on a glass substrate. The problem is that the metal will always absorb a part of the incident light, which may reduce the efficiency of the optical system and also damage the mirror itself. But using thin layer interference it is possible to design a layer structure made entirely of dielectric materials that can effectively reflect the laser light without absorbing any of it. Carefully designed layer structures may be tailored to reflect or transmit certain wavelengths or wavelength bands, while blocking others. These are usually referred to as interference filters.

Thin layer interference may even be used for decorative purposes. When exposed to air, most metals start oxidising, until a stable oxide layer forms on their surface and protects them from further oxidation. The thickness of this so-called native oxide layer is usually very small (sometimes less than a nanometre), but it can be increased relatively simply by an electrochemical process (the so-called anodic oxidation) or by a heat treatment in an oxidising atmosphere. Just like the colour reflected back from a soap bubble depends on the thickness of its wall, so will the colour of the metal depend on the thickness of its oxide layer. Certain metals (such as aluminium) can be coloured relatively easily this way, without the use of any paint. Since aluminium oxide is very hard and very stable, so-called eloxed aluminium will be more durable, and maintain its colour much longer, than any paint.
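Returning to the single-layer anti-reflection coating discussed above, the reversed criterion gives a simple estimate of the thinnest useful layer. The sketch assumes normal incidence and a layer whose index lies between that of air and that of the substrate, so that both reflected waves suffer a phase shift and reflection is weakest when 2nd = λ/2. The function name and the example index of 1.38 are assumptions made for illustration.

```python
def quarter_wave_thickness(wavelength_nm, n_layer):
    """Thinnest layer giving destructive reflection at normal incidence.

    Assumes n_air < n_layer < n_substrate, so the criteria are reversed:
    2 * n * d = (m - 1/2) * lambda, and m = 1 gives d = lambda / (4 * n).
    """
    return wavelength_nm / (4.0 * n_layer)

if __name__ == "__main__":
    # assumed values: green light and a low-index coating material
    d = quarter_wave_thickness(550.0, 1.38)
    print(f"layer thickness: {d:.1f} nm")
```

Such a layer is tuned to one wavelength only, which is why broadband coatings need the multi-layer structures mentioned in the text.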
Since Young's diffraction experiments in the early 19th century scientists knew that light is a wave. But all other types of waves required some sort of medium to exist. For example a ringing alarm clock may be silenced by placing it into a vacuum chamber and pumping the air out. Without the presence of air, sound waves cannot reach us, therefore we won't hear the alarm. But if there is a window on the chamber we will still see the clock. This means that light waves can travel through vacuum. 19th century scientists were left with a tough question: what is the medium through which light waves are travelling? What remains in the vacuum chamber after the air has been pumped out? What is the medium filling the vacuum of space, and allowing sunlight and the light of distant stars to reach us?

To answer the question scientists postulated the existence of a medium called luminiferous (or light-bearing) aether. It must be noted however that even 19th century scientists realised that this hypothetical medium should have rather peculiar properties. Aether must be omnipresent, as light is capable of propagating through vacuum. On the other hand it cannot be solid, since objects can move through it without any measurable resistance. This raises another difficult problem: certain physical phenomena (such as birefringence) cannot be explained by longitudinal waves, only by transverse waves. As we have already mentioned in the previous semester, transverse waves can exist only in solids. But if aether is a solid, how could other solid objects (such as planets and other celestial bodies) pass through it without any resistance?1

As an attempt to resolve the contradiction Augustin Louis Cauchy suggested that aether may behave like a non-Newtonian fluid. The viscosity of certain materials (such as a suspension of corn starch in water) depends on the shear rate: when we try to change their shape slowly they flow like a fluid, but they resist quick deformations as if they were solid. Since the frequency of light waves is very high, it seemed possible that aether may act as a solid at these high frequencies (thus enabling transverse waves), while flowing virtually without any resistance at low frequencies. (It must be noted however that it seemed hard to imagine a substance which has practically zero viscosity at low frequencies, yet proves to be several orders of magnitude stiffer than any other material at high frequencies.) A further problem was that aether had to be massless, otherwise its gravity would influence the orbits of planets.

The aether hypothesis persisted even after Maxwell had given a full description of electrodynamics, which led to the realisation that light is a form of electromagnetic wave. These waves are basically oscillating electric and magnetic fields, and scientists believed that only separated positive and negative electrical charges may create such dipolar electric fields. Since electrical charge is an inextricable property of matter, this also suggested that light must propagate through some sort of medium.2

According to Maxwell's equations electromagnetic waves travel at a constant velocity, which depends only on the magnetic permeability and electric permittivity of the medium. Since scientists believed that aether is the medium which carries electromagnetic waves, it seemed obvious that this velocity has to be measured with respect to the aether. This also meant that it could serve as a universal frame of reference: it should be possible to determine the velocity of any observer with respect to the omnipresent aether by measuring the speed of light. (If light is propagating through aether at a constant velocity, an observer moving with respect to aether should find that the speed of light in his frame of reference is slightly altered by his own velocity with respect to aether.) The problem was that even the velocities of celestial objects are small compared to the speed of light. Thus instruments had to be very precise to be able to detect such small variations.

The first apparatus that had the required precision was constructed by Michelson and Morley in 1887. The idea was that the Earth is moving at a relatively high velocity around the Sun. (Its average orbital speed is 29.78 km/s, which is only four orders of magnitude smaller than the speed of light.) As the Earth orbits the Sun the direction and magnitude of its velocity are changing continuously. This means that even if it happens to be stationary with respect to aether on the day when we perform the experiment, this cannot last: as it continues to orbit the Sun, its velocity has to change, and it cannot remain stationary. This means that the speed of light should differ by up to a few tens of km/s in the direction of Earth's movement compared to the perpendicular direction, and this difference (and the direction in which it is detected) should change annually. To detect this difference Michelson constructed a device capable of comparing the speed of light in two perpendicular directions.

1 Thomas Young was not the first scientist who attempted to describe light as a wave. Christiaan Huygens had already suggested this interpretation in the 17th century, but it was rejected by Newton because of the contradiction described above.

2 Although electromagnetic induction of electric fields in vacuum is not forbidden by Maxwell's equations, it cannot be directly demonstrated, since all methods of detecting electric fields require the presence of electrically charged particles.
Figure 10.1: A simplified Michelson interferometer

A simplified sketch of a Michelson interferometer is shown in figure 10.1. Light coming from the source (S) is split into two beams by a half-silvered mirror B. (This is usually referred to as a beam splitter.) Some of the light is transmitted towards mirror M1, while the rest is reflected towards M2. (The BM1B and BM2B paths, where the two beams travel separately, are usually referred to as the arms of the interferometer.) After both beams are reflected back from their respective mirrors they are recombined at the beam splitter: some of the light coming from M1 is reflected towards the observer (O), while some of the light reflected back from M2 is transmitted by the beam splitter in the same direction. These two beams interfere with each other, and the intensity detected by the observer3 depends on their phase difference: it is maximal when they meet in phase, and minimal when they meet in opposing phases. The phase difference in turn depends on the time light requires to travel along each arm of the interferometer. This means that if the lengths of the arms are adjusted to be equal, the interferometer may be used to detect differences in the speed of light in the directions of its two arms.

The precision of the interferometer depends on the length of its arms. The longer the arms, the larger the phase difference caused by the same difference in the speed of light. To improve precision Michelson and Morley constructed an interferometer where light was reflected back and forth several times along the arms, increasing the path to 11 metres. To reduce vibrations and temperature variation, the experiment was performed in a closed basement room. To further decrease vibrations the interferometer itself was assembled on a large sandstone block, which was floating in a pool of mercury. This also made it easy to turn the entire device. After a single push it would keep rotating for a relatively long time, making it possible to scan deviations in the speed of light in different directions without touching the interferometer. (Touching the device would create so much vibration that it would make the measurement impossible.)

Before the start of the experiment the interferometer was adjusted until the phase difference between the beams was zero, then it was given a small push to start its rotation. This made it possible to observe how the interference pattern changed as the device was slowly rotating. When one of its arms pointed in the direction of the movement of Earth, the speed of light should have been altered in that direction. This could be detected as a shift in the interference pattern. As the device was rotating, the arms would point alternately in the direction of Earth's movement and in the perpendicular direction, periodically shifting the interference pattern.

Michelson and Morley estimated that they might be able to detect a shift of the pattern as small as one hundredth of a fringe. At the same time the orbital velocity of the Earth around the Sun should have resulted in a shift of 0.4 fringe, which should have been easily detectable. But contrary to their expectations they found the speed of light to be the same in all directions.4

3 In practice the interference is usually observed on a screen. This makes it possible to see the interference of slightly off-axis rays (dotted line in figure 10.1). These rays have to travel a slightly longer distance, which proportionally increases the phase difference between the interfering waves. The result is that the intensity on the screen varies periodically with the distance from the centre, forming a series of bright and dark rings that are called interference fringes. (It is also possible to produce a line pattern on the screen instead of circles by slightly tilting the mirrors.)

4 It must be noted that there could be another interpretation. If we try to measure the speed of sound, it proves to be the same in all directions, and shows no annual variations, despite Earth's movement around the Sun. The reason is that our planet is carrying its own atmosphere with it, thus the velocity of Earth with respect to the air surrounding it is practically zero. It seemed possible that objects may drag aether with them, the same way as planets carry their atmosphere. This was referred to as the aether drag hypothesis. But complete aether drag is not compatible with the results of certain experiments (such as the Fizeau experiment) and astronomical observations (such as stellar parallax measurements) that were performed decades before the Michelson-Morley experiment, therefore this cannot be a valid interpretation of the results.
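The expected shift of about 0.4 fringe quoted above can be checked with a short estimate. The sketch uses the standard classical (aether) result for rotating the interferometer by 90 degrees, ΔN = 2Lv²/(λc²), which is not derived in the text. The function name and the 550 nm wavelength are assumptions chosen for illustration; the 11 m path and the 29.78 km/s orbital speed are the values given above.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def expected_fringe_shift(arm_path_m, speed_m_s, wavelength_m):
    """Classical aether estimate of the fringe shift for a 90 degree rotation.

    delta_N = 2 * L * v**2 / (lambda * c**2), valid for v much smaller than c.
    """
    return 2.0 * arm_path_m * speed_m_s**2 / (wavelength_m * C**2)

if __name__ == "__main__":
    # 11 m optical path and Earth's orbital speed from the text;
    # the wavelength is an assumed value for visible light
    shift = expected_fringe_shift(11.0, 29_780.0, 550e-9)
    print(f"expected shift: {shift:.2f} fringes")
```

The result is several tens of times larger than the one hundredth of a fringe the instrument could resolve, which is why the null result was so significant.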
10.2
The result of these experiments proved to be very hard to explain. According to classical physics the movement of the observer should influence the speed of light, but experiments have proven this expectation to be wrong. It was Einstein's special theory of relativity that gave a proper interpretation of these results in 1905.

The principle of relativity was well known even in classical mechanics. It stated that all inertial frames of reference are equivalent: mechanical laws are the same in all inertial frames. For example an observer sitting on a train whose curtains are drawn cannot perform any mechanical experiment that could determine if the train is moving or not. But this principle would have been violated by a positive result of the Michelson-Morley experiment. It would have created a means to measure the velocity of an observer with respect to aether. Thus not all inertial frames of reference would be equivalent: there would be a special, privileged frame (the one which is at rest with respect to the aether) relative to which all velocities should be measured. The principle of relativity would be true for mechanics (as there are no mechanical experiments through which inertial frames could be distinguished), but it would not apply to other disciplines of physics, such as electrodynamics.

Einstein believed that this is unacceptable: there is only one physical world, and it is guided by one set of rules. There should be a symmetry to the laws of nature: the most basic principles (such as relativity) should apply to all fields of science. Einstein's idea was surprisingly simple and elegant: let us accept relativity as a general principle that applies to all of nature, not only mechanics. And let us also accept the result of the Michelson-Morley experiment, which shows that the movement of the observer does not influence the speed of light. Einstein summarised his assumptions in two postulates:

1. All inertial observers are equivalent with respect to ALL natural phenomena. There are no special, privileged frames of reference.

2. The speed of light is the same in all inertial frames, irrespective of the state of movement of the light source or the observer.

To understand the consequences of these assumptions let us follow a simple thought experiment. Let us consider two inertial frames of reference $K$ and $K'$. Let the x axis of both coordinate systems point in the same direction, and let $K'$ move in this direction at a constant velocity $u$ with respect to $K$. For the sake of simplicity let us synchronise the clocks of two observers sitting in the origins of $K$ and $K'$ when they pass each other. (Thus $t_0 = t'_0 = 0$ when $x_0 = x'_0 = 0$.) At the moment when the two origins coincide let us send a light pulse from that point in the positive x direction, and let us try to measure its velocity in both frames of reference. Velocity measurements are relatively straightforward: all we have to do is measure the time light requires to travel a given distance, and then calculate the ratio of the distance and the travel time. Again, for the sake of simplicity let us pick a point on the x axis, and instruct the observers in both $K$ and $K'$ to measure the time light requires to reach this point. The observer in the $K$ frame will find that the light pulse reaches this point at time $t$, and its position is $x$. Let us mark the time measured by the observer in $K'$ by $t'$, and the coordinate of the point by $x'$. According to classical mechanics there is a connection between the values measured in $K$ and $K'$, which is called the Galilean transformation:

$$x' = x - ut \tag{10.1}$$

$$t' = t \tag{10.2}$$

Or:

$$x = x' + ut' \tag{10.3}$$

From this, the speed of light measured in $K'$ should be:

$$c' = \frac{x'}{t'} = \frac{x - ut}{t} = \frac{x}{t} - u = c - u \tag{10.4}$$
where $c$ is the speed of light in $K$, and $c'$ is the speed of light in $K'$. This means that if the Galilean transformation were true, the speed of light would be different in different frames of reference, which would contradict Einstein's second postulate and the results of the experiments discussed in the previous section. This contradiction clearly shows that there is something wrong with the Galilean transformation.

If we carry out the experiment in practice, both observers will measure the speed of light to be the same. From the viewpoint of an observer in $K$, the fact that the other observer in $K'$ gets the same result for the speed of light would seem as if he had made some mistakes during his measurements. It would seem as if the observer in $K'$ had measured either the distance or the time (or both) incorrectly, as if his instruments were not calibrated correctly. (Imagine that $u$ is three quarters of the speed of light. In this case, the observer in $K$ would expect that the observer in $K'$ will measure the speed of light to be one quarter of light speed. Instead, he reports that he has measured the speed of light to be the same as in $K$. From the viewpoint of the observer in $K$, this would seem as if the results of the other observer were off by a factor of four.) The observer in $K$ may take this into account by multiplying every value measured by the observer in $K'$ by a constant factor $\gamma$:

$$x' = \gamma(x - ut) \tag{10.5}$$

But the observer in $K'$ would believe the same about the observer in $K$. He would think that his own measurements are correct, and the measurements of the observer in $K$ are the ones that are distorted. Just like his colleague in $K$, he may try to correct this by multiplying every measurement made by the observer in $K$ by a constant factor. But remember: the principle of relativity dictates that all inertial observers are equivalent. If one of them thinks that the measurements of the other are off by a factor of $\gamma$, the other should think the same about him. Thus the observer in $K'$ should multiply the values measured by the observer in $K$ by the same factor:

$$x = \gamma(x' + ut') \tag{10.6}$$

Figure 10.2: Observers in both $K$ and $K'$ measure the speed of light to be the same ($c = \frac{x}{t} = \frac{x'}{t'}$). This contradicts the predictions of classical physics, thus the Galilean transformation has to be modified.

We may determine the value of $\gamma$ from equations (10.5) and (10.6). Let us multiply the left hand side of (10.5) by the left hand side of (10.6) and the right hand side of (10.5) by the right hand side of (10.6):

$$xx' = \gamma^2 (x' + ut')(x - ut) \tag{10.7}$$

$$\gamma^2 = \frac{xx'}{(x' + ut')(x - ut)} \tag{10.8}$$

$$\gamma^2 = \frac{xx'}{xx' + ut'x - utx' - u^2 tt'} \tag{10.9}$$

$$\gamma^2 = \frac{1}{1 + u\frac{t'}{x'} - u\frac{t}{x} - u^2\frac{tt'}{xx'}} \tag{10.10}$$

According to Einstein's second postulate the speed of light is the same in both frames of reference, thus $\frac{x}{t} = \frac{x'}{t'} = c$, and:

$$\gamma^2 = \frac{1}{1 + \frac{u}{c} - \frac{u}{c} - \frac{u^2}{c^2}} = \frac{1}{1 - \frac{u^2}{c^2}} \tag{10.11}$$

$$\gamma = \frac{1}{\sqrt{1 - \frac{u^2}{c^2}}} \tag{10.12}$$

Substituting this into equation (10.5), and solving equations (10.5) and (10.6) for $t'$, gives:

$$x' = \frac{x - ut}{\sqrt{1 - \frac{u^2}{c^2}}} \tag{10.13}$$

$$t' = \frac{t - \frac{ux}{c^2}}{\sqrt{1 - \frac{u^2}{c^2}}} \tag{10.14}$$
Equations (10.13) and (10.14) are called the Lorentz transformation, and they replace the Galilean transformation in Einstein's special theory of relativity. Unlike the Galilean transformation, the Lorentz transformation ensures that the speed of light is the same in all inertial frames of reference. This means that we have to abandon the Galilean transformation and use the Lorentz transformation instead, because it is the one which complies with the results of our observations. It has to be noted however that this change has some very interesting consequences, as we will see in the following sections.
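The difference between the two transformations can be illustrated numerically with the light pulse of the thought experiment. The sketch below is only an illustration: the function names are chosen here, and the relative velocity of three quarters of the speed of light is the example value used earlier in the text. It transforms the event "the pulse is at x = ct at time t" into the moving frame and computes the speed of light seen there.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def galilean(x, t, u):
    """Galilean transformation: x' = x - u t, t' = t."""
    return x - u * t, t

def lorentz(x, t, u):
    """Lorentz transformation: x' = g (x - u t), t' = g (t - u x / c^2)."""
    g = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    return g * (x - u * t), g * (t - u * x / C**2)

if __name__ == "__main__":
    u = 0.75 * C   # relative velocity of the two frames
    t = 1.0        # time in K, seconds
    x = C * t      # position of the light pulse in K

    xg, tg = galilean(x, t, u)
    xl, tl = lorentz(x, t, u)
    print("Galilean: c'/c =", xg / tg / C)
    print("Lorentz:  c'/c =", xl / tl / C)
```

The Galilean result is one quarter of the speed of light, as the observer in the stationary frame would classically expect, while the Lorentz transformation returns the same speed of light in both frames.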
10.3
In classical physics (using the Galilean transformation) the length of an object was the same in all frames of reference. In Einstein's theory of relativity this is no longer true. Imagine that we have a rod that is at rest in the $K'$ frame (which is moving at a constant velocity $u$ with respect to the $K$ frame in the positive x direction). Let us mark the length of the rod in the $K'$ frame by $L' = x'_2 - x'_1$, and by $L = x_2 - x_1$ in the $K$ frame, where the positions of both ends are measured at the same time $t$ in $K$. According to the Lorentz transformation the length of the rod in the $K'$ frame is:

$$L' = x'_2 - x'_1 \tag{10.15}$$

$$L' = \frac{x_2 - ut}{\sqrt{1 - \frac{u^2}{c^2}}} - \frac{x_1 - ut}{\sqrt{1 - \frac{u^2}{c^2}}} \tag{10.16}$$

$$L' = \frac{x_2 - x_1}{\sqrt{1 - \frac{u^2}{c^2}}} \tag{10.17}$$

$$L' = \frac{L}{\sqrt{1 - \frac{u^2}{c^2}}} \tag{10.18}$$

Or:

$$L = L'\sqrt{1 - \frac{u^2}{c^2}} \tag{10.19}$$

This means that from the viewpoint of an observer in the $K$ frame, the length of the moving rod has been reduced by a factor of $\sqrt{1 - \frac{u^2}{c^2}}$. According to Einstein's theory of relativity moving objects contract in the direction of their movement with respect to their rest length. (The length of the rod in the co-moving frame is referred to as the rest length.) This phenomenon is called Lorentz contraction.
Figure 10.3: Due to Lorentz contraction, moving objects contract in the direction of their movement with respect to their rest length.

In classical physics time was the same in all frames of reference. According to equation (10.14) this is no longer true in relativistic physics. To understand the phenomenon, imagine that we try to measure the time that has passed between two events in both the K and K' frames. (For example, imagine that a really fast race car is driving at a constant velocity between the start and finish line of a straight race track. We try to measure the time the car required to drive along the track. The two events are the car passing the start and the finish line.) Again, the K' frame (the race car) is moving at a constant u velocity in the positive x direction with respect to the K frame (the race track). For the sake of simplicity let the two events take place at the same location in the K' frame, thus $\Delta x' = x_2' - x_1' = 0$. (The driver's stopwatch is moving together with the race car, thus its position with respect to the race car does not change.) Since the K' frame is moving with respect to the K frame, the two events do not take place at the same location in the K frame. (The first event takes place at the start line, while the second event takes place at the finish line. The distance between the positions of the two events in the racetrack's frame of reference is the length of the track.) Let us denote the time between the two events in the K frame by $\Delta t = t_2 - t_1$ (this is the time measured by the racetrack's built-in timing system), and in the K' frame by $\Delta t' = t_2' - t_1'$. According to the Lorentz transformation:

$$\Delta t = t_2 - t_1 \tag{10.20}$$

$$\Delta t = \frac{t_2' + \frac{u x_2'}{c^2}}{\sqrt{1 - \frac{u^2}{c^2}}} - \frac{t_1' + \frac{u x_1'}{c^2}}{\sqrt{1 - \frac{u^2}{c^2}}} \tag{10.21}$$

$$\Delta t = \frac{t_2' - t_1' + \frac{u (x_2' - x_1')}{c^2}}{\sqrt{1 - \frac{u^2}{c^2}}} \tag{10.22}$$

$$\Delta t = \frac{t_2' - t_1'}{\sqrt{1 - \frac{u^2}{c^2}}} \tag{10.23}$$

$$\Delta t = \frac{\Delta t'}{\sqrt{1 - \frac{u^2}{c^2}}} \tag{10.24}$$

Since $\sqrt{1 - u^2/c^2}$ is always smaller than 1, $\Delta t$ is always longer than $\Delta t'$. (The official timing system of the racetrack will always measure the run-time of the driver to be slightly longer than his own measurement. From the viewpoint of the observers standing along the racetrack, the driver's clock is not accurate: it is ticking slightly slower than the stationary clock of the race track, as if time has slowed down in the race car.) The phenomenon is referred to as time dilatation.
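Both results can be checked numerically by applying the Lorentz transformation directly to events. The following is only a minimal sketch: the function names, the velocity u = 0.6c and the sample coordinates are illustrative assumptions, not values taken from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def lorentz(x, t, u):
    """Transform the event (x, t) from K to K', where K' moves at u along +x."""
    root = math.sqrt(1.0 - u**2 / C**2)
    return (x - u * t) / root, (t - u * x / C**2) / root

def inverse_lorentz(x_p, t_p, u):
    """Transform the event (x', t') from K' back to K (same formula with -u)."""
    return lorentz(x_p, t_p, -u)

u = 0.6 * C                          # assumed velocity; sqrt(1 - u^2/c^2) is 0.8
root = math.sqrt(1.0 - u**2 / C**2)

# Lorentz contraction: both rod endpoints are measured at the same time t in K.
x1, x2, t = 0.0, 8.0, 0.0            # assumed length in K: L = 8 m
x1_p, _ = lorentz(x1, t, u)
x2_p, _ = lorentz(x2, t, u)
contracted_length = x2 - x1          # L, measured in K
rest_length = x2_p - x1_p            # L' = L / sqrt(1 - u^2/c^2)

# Time dilatation: two events at the same place in K', 1 s apart on the moving clock.
_, t1 = inverse_lorentz(0.0, 0.0, u)
_, t2 = inverse_lorentz(0.0, 1.0, u)
dt = t2 - t1                         # dt = dt' / sqrt(1 - u^2/c^2)

print(rest_length, contracted_length, dt)
```

With these assumed inputs the rest length comes out longer than the length measured in K, and the time between the two events comes out longer in K than on the moving clock, in line with equations (10.19) and (10.24).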
To better understand the relationship between Lorentz contraction and time dilatation, imagine that mankind is planning its first interstellar expedition because our astronomers have found a habitable planet 200 light years away from Earth. The problem is that no spaceship can travel faster than the speed of light, and since we have no working hibernation technology, one would think that none of the explorers would live long enough to reach their destination. But this is not the case. The theory of relativity offers a solution: if we can build a spaceship that is fast enough, Lorentz contraction and time dilatation would make the journey feasible.

Let's say that we are able to build a spaceship that can travel at 99% of the speed of light. In this case the value of the $\sqrt{1 - u^2/c^2}$ factor is approximately 0.141. Although it takes the spaceship 202 years to travel 200 light years at this velocity, the astronauts would still survive the trip. From the viewpoint of the people left behind on Earth, time dilatation slows down time aboard the spaceship. Even though an earthbound observer would say that the journey took 202 years, the explorers aboard the spaceship would age only 28.5 years.

But the astronauts on the spaceship would tell a different story. Remember that all motion is relative! From their viewpoint the entire universe is moving with respect to them at 99% of the speed of light. You should also remember that all moving bodies contract in the direction of their motion due to the Lorentz contraction. This means that from the viewpoint of the astronauts the distance they have to travel has shrunk to 28.2 light years, and traveling at 99% of the speed of light they can cover this distance in 28.5 years.

Notice that although the stories of the earthbound observers and the astronauts are very different, the end result is the same: the astronauts have reached their destination, and they aged only 28.5 years during the journey.
From the viewpoint of the earthbound observers this was due to time dilatation, while the astronauts would claim that it was Lorentz contraction which made their journey possible.

Although this thought experiment may seem very futuristic, and it is unlikely that we will organise such an expedition in the foreseeable future, these effects have been verified by experiments. We might not be able to accelerate a large spaceship to 99% of the speed of light, but we are capable of accelerating elementary particles to such high velocities. Just like people, certain elementary particles have limited lifetimes: they decay into other particles. According to classical physics this should limit the distance they can travel from their point of origin at a given velocity. But experiments show that (just like our fictional astronauts) they may reach considerably larger distances, due to relativistic effects. The results of these experiments show good agreement with the predictions of Einstein's theory of relativity.

The important lesson to remember is that space and time are not absolute: they depend on your point of view. Different observers may give different descriptions of the same experiment, but the end result is always the same. There is only one physical reality, and although the interpretation of events may vary with the viewpoint of the observer, the laws of nature are the same in all inertial frames of reference.
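The numbers of the interstellar journey can be reproduced with a few lines of Python. This is a sketch only; the function and variable names are illustrative, and speeds are expressed as fractions of the speed of light so that distances in light years and times in years can be divided directly.

```python
import math

def contraction_factor(beta):
    """Return sqrt(1 - u^2/c^2) for a speed u = beta * c."""
    return math.sqrt(1.0 - beta**2)

beta = 0.99            # speed of the spaceship as a fraction of c
distance_ly = 200.0    # distance in the Earth's frame, in light years

factor = contraction_factor(beta)

# Earthbound viewpoint: the trip takes distance / speed, but the ship's clock runs slow.
earth_time = distance_ly / beta
ship_time_from_dilatation = earth_time * factor

# Astronauts' viewpoint: the distance itself is contracted.
contracted_distance = distance_ly * factor
ship_time_from_contraction = contracted_distance / beta

print(f"factor               = {factor:.3f}")
print(f"time on Earth        = {earth_time:.1f} years")
print(f"contracted distance  = {contracted_distance:.1f} light years")
print(f"time aboard the ship = {ship_time_from_dilatation:.1f} years")
```

Both viewpoints give the same elapsed time aboard the ship, because the two calculations multiply the same quantities in a different order.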
10.4 Velocity addition
Imagine that there is an object moving at a velocity v in the positive x direction in the K inertial frame. Let us try to calculate the velocity of this object in the K' frame, which is (yet again) moving at a constant velocity u in the same direction with respect to K. As we have noted earlier, velocity measurements are really straightforward: all we have to do is determine the distance the object travels in a given time, and calculate the ratio of the two values. If in the K' frame the object is at the position $x_1'$ at the moment $t_1'$, and at the position $x_2'$ at the moment $t_2'$, its velocity is:

$$v' = \frac{x_2' - x_1'}{t_2' - t_1'} \tag{10.25}$$
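We may use equations (10.13) and (10.14) to determine $x_1'$, $x_2'$, $t_1'$ and $t_2'$ from the corresponding coordinates in K:

$$v' = \frac{\dfrac{x_2 - ut_2}{\sqrt{1 - \frac{u^2}{c^2}}} - \dfrac{x_1 - ut_1}{\sqrt{1 - \frac{u^2}{c^2}}}}{\dfrac{t_2 - \frac{ux_2}{c^2}}{\sqrt{1 - \frac{u^2}{c^2}}} - \dfrac{t_1 - \frac{ux_1}{c^2}}{\sqrt{1 - \frac{u^2}{c^2}}}} \tag{10.26}$$

The square roots cancel. Dividing both the numerator and the denominator by $t_2 - t_1$, and using that $v = (x_2 - x_1)/(t_2 - t_1)$ is the velocity of the object in K:

$$v' = \frac{(x_2 - x_1) - u(t_2 - t_1)}{(t_2 - t_1) - \frac{u(x_2 - x_1)}{c^2}}$$

$$v' = \frac{\dfrac{x_2 - x_1}{t_2 - t_1} - u}{1 - \dfrac{u}{c^2} \cdot \dfrac{x_2 - x_1}{t_2 - t_1}}$$

$$v' = \frac{v - u}{1 - \frac{uv}{c^2}} \tag{10.30}$$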
We may also rearrange (10.30) to calculate the velocity of the object in K from its velocity in K':

$$v = \frac{v' + u}{1 + \frac{uv'}{c^2}} \tag{10.31}$$
Imagine that an object is moving at 0.75c in the K' frame, and K' is also moving at 0.75c with respect to K. In classical physics, the Galilean transformation would predict that an observer in the K frame would find the velocity of the object to be 1.5c. But according to relativistic calculations the velocity of the object in the K frame is only

$$v = \frac{0.75c + 0.75c}{1 + 0.5625} = 0.96c \tag{10.32}$$

which is smaller than the speed of light. Even if u and v' are both 99% of the speed of light, v is going to be only 0.99995c. No matter how close u and v' come to the speed of light, the velocity of the object can never be higher than the speed of light in any frame of reference. This shows that the speed of light is not only the same in all frames of reference, it is also the maximal velocity of any object.
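The two worked examples above can be reproduced with a short Python sketch of the velocity addition formula. The function names are illustrative, and all velocities are given as fractions of the speed of light, so that c = 1.

```python
def velocity_in_k(v_prime, u):
    """Velocity in K of an object moving at v_prime in K', where K' moves at u (units of c)."""
    return (v_prime + u) / (1.0 + u * v_prime)

def velocity_in_k_prime(v, u):
    """Inverse relation: velocity in K' of an object moving at v in K (units of c)."""
    return (v - u) / (1.0 - u * v)

v_example = velocity_in_k(0.75, 0.75)   # relativistic result for the 0.75c case
v_extreme = velocity_in_k(0.99, 0.99)   # still below 1, that is, below c
galilean = 0.75 + 0.75                  # classical prediction: 1.5c

print(f"relativistic: {v_example:.2f}c, classical: {galilean:.2f}c")
print(f"u = v' = 0.99c gives v = {v_extreme:.5f}c")
```

Applying `velocity_in_k_prime` to the result of `velocity_in_k` with the same u returns the original v', which is a quick consistency check that the two formulas are rearrangements of each other.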
10.5
Throughout this chapter we have discussed the differences between relativistic physics and classical physics. The formulas we have to use in relativistic calculations are markedly different from the formulas of classical mechanics, and there is a series of phenomena that could never exist in classical physics. It may seem like there is a direct contradiction between classical and relativistic physics. But how is this possible? The scientific method requires that all theories must be based on measurements and observations, and the predictions of the theories should be tested by further experiments. The laws of classical physics went through this process. They have been tested by countless measurements, and for centuries scientists found no deviations from their predictions. If classical physics is wrong, how could it pass all these checks? And even more importantly: how can we trust any physical theories after this?

The answer is that classical physics is not really wrong, only less precise than modern physics. All scientific theories are based on experiments and measurements. The problem is that no matter how careful we are, no practical measurement can be completely precise. There is always some measurement error.⁵ The amount of measurement data also has practical limits. This means that theories are based on a finite number of measurements performed in a finite range of parameters with finite precision. If a theory is so close to the truth that the deviation is smaller than the error of our measurements, the problem may not be detected without new, more precise measurement techniques.

⁵ The only exception is when we are measuring a discrete quantity, for example the number of some objects. But even in such cases, absolute precision might not be guaranteed. For example, the number of particles released by a nuclear process may be counted precisely, but such processes tend to be stochastic. This means that there is a random variation in the number of particles released by the sample. The measurement data will show a statistical fluctuation, not because of the inaccuracy of our measurements, but because of the random nature of the process itself.
For example, classical mechanics was established based on measurements and observations performed at relatively low velocities (at least compared to the speed of light). At such velocities the predictions of relativistic physics are so close to the predictions of classical physics that the deviation is hardly detectable, even with today's modern instruments. Consider the Lorentz transformation (equations (10.13) and (10.14)):

$$x' = \frac{x - ut}{\sqrt{1 - \frac{u^2}{c^2}}} \tag{10.33}$$

$$t' = \frac{t - \frac{ux}{c^2}}{\sqrt{1 - \frac{u^2}{c^2}}} \tag{10.34}$$
Although the formulas seem to be very different from the formulas of the Galilean transformation (equations (10.1) and (10.2)), the deviation is very small at low velocities. The speed of light is three or four orders of magnitude higher than even celestial velocities. Therefore the $u^2/c^2$ term is usually a very small number. Thus $1 - u^2/c^2$ is very close to 1, and its square root is even closer. This means that unless the velocities involved are close to the speed of light, the denominator on the right hand side of both equations may be omitted. In a similar manner, it is easy to see that $ux/c^2 \ll t$. Using these approximations:

$$x' = \frac{x - ut}{\sqrt{1 - \frac{u^2}{c^2}}} \approx x - ut \tag{10.35}$$

$$t' = \frac{t - \frac{ux}{c^2}}{\sqrt{1 - \frac{u^2}{c^2}}} \approx t \tag{10.36}$$
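How small the deviation is can be illustrated numerically by transforming the same event with both sets of formulas. This is a rough sketch: the function names, the sample event and the two velocities (about 30 km/s, roughly the orbital speed of the Earth, and half the speed of light) are assumptions chosen for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def lorentz(x, t, u):
    """Lorentz transformation of the event (x, t) into a frame moving at u along +x."""
    root = math.sqrt(1.0 - u**2 / C**2)
    return (x - u * t) / root, (t - u * x / C**2) / root

def galilean(x, t, u):
    """Galilean transformation of the same event: x' = x - ut, t' = t."""
    return x - u * t, t

def relative_deviation(x, t, u):
    """Relative difference between the Lorentz and Galilean results for x' and t'."""
    x_l, t_l = lorentz(x, t, u)
    x_g, t_g = galilean(x, t, u)
    return abs(x_l - x_g) / abs(x_g), abs(t_l - t_g) / abs(t_g)

x, t = 1000.0, 1.0  # hypothetical event: 1 km from the origin, 1 s after t = 0
for u in (3.0e4, 0.5 * C):
    dx, dt = relative_deviation(x, t, u)
    print(f"u/c = {u / C:.1e}: deviation in x' = {dx:.1e}, in t' = {dt:.1e}")
```

At the lower velocity the two transformations agree to better than one part in a hundred million, while at half the speed of light they differ by more than ten percent.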
The Galilean transformation is the low velocity limit of the Lorentz transformation. In other words, the predictions of the classical and the relativistic theory are practically the same at low velocities; no significant deviations can be detected. This is the reason why scientists did not realise that there was something wrong with classical mechanics until the late 19th century. This also means that we may continue to use classical mechanics at these low velocities with impunity, despite the fact that we know that there are fundamental problems with the theory, because we also know that these problems manifest themselves only at very high velocities.

This also reveals the relationship between the laws of physics that you may find in textbooks and the actual laws of nature. Scientists are trying to create a mathematical model of the physical world around us. No one can guarantee that this model is completely precise, but scientists are always using the best and most precise techniques that are available at a given time to look for deviations. If any deviation is found, the model is revised to account for the newly found phenomena. This continuous, self-critical revision and improvement is the key to the success of science. We may trust that the predictions of scientific theories are always precise enough for any practical application, because they are always based on the best and most precise measurements that our civilisation can produce at a given time.