We are interested in solving direct (as opposed to inverse) electromagnetic (EM) scattering problems with multiple spheroidal scatterers. Denoting the EM field by E(r, t) ∈ ℂ³, where r is a point in ℝ³ and t is time, we assume harmonic time dependence, so that the factor e−iωt separates out and the time dimension can be treated independently. We also make certain assumptions about the medium and the scatterer material properties, under which the relevant equation to be satisfied is the homogeneous Helmholtz equation, subject to the appropriate boundary conditions (BCs).
All contributions to the total EM field (“radiation”), including the incident and the scattered fields, are represented by solutions of the vector Helmholtz equation, where ∇2 is the vector Laplacian, E(r) : ℝ³ ↦ ℂ³ must be divergence-free (i.e.~∇ ⋅ E = 0), and k ∈ ℂ is the wavenumber, related to the wavevector k ∈ ℂ³, where ω = 2πc/λ is the angular frequency, c is the speed of light, λ is the wavelength, and ϵ is the frequency-/wavelength-dependent dielectric function. It is often convenient to write k = kn, where n ∈ ℂ³ satisfies n ⋅ n = 1 but is not necessarily a unit vector. In Cartesian coordinates (x, y, z) it is relatively straightforward to verify that a stationary plane wave of the form E0exp (ik ⋅ r) is a solution of () for any constant E0 ∈ ℂ³ satisfying E0 ⋅ k = 0, since EM waves are transverse.
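The plane-wave solution can be verified numerically. The sketch below (with an arbitrary example wavevector, not taken from the text) checks the transversality condition E0 ⋅ k = 0 and confirms, via central finite differences, that each Cartesian component of the plane wave satisfies the scalar Helmholtz equation:

```python
import numpy as np

k = np.array([1.2, -0.7, 2.1])          # example real wavevector (arbitrary)
k2 = k @ k                              # k^2 = k . k

# Constant amplitude E0 orthogonal to k, enforcing transversality E0 . k = 0
E0 = np.cross(k, np.array([0.0, 0.0, 1.0]))
assert abs(E0 @ k) < 1e-12

def field(r):
    """Scalar factor exp(i k.r); each Cartesian component of E shares it."""
    return np.exp(1j * (k @ r))

# Central-difference Laplacian of exp(i k.r) at a test point
r0 = np.array([0.3, -0.1, 0.5])
h = 1e-3
lap = sum(
    (field(r0 + h * e) - 2 * field(r0) + field(r0 - h * e)) / h**2
    for e in np.eye(3)
)

# Helmholtz equation: Laplacian(f) + k^2 f = 0
residual = lap + k2 * field(r0)
print(abs(residual) / k2)   # small: dominated by O(h^2) truncation error
```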
The solution must also satisfy the appropriate BCs at interfaces and/or at infinity. Given the geometry of the problems we address, the BCs are better expressed in spherical polar coordinates (r, θ, ϕ), with the general solution to () being a linear combination of spherical waves. To build up the relevant formulae, it is useful to first consider the scalar Helmholtz equation, where E(r) : ℝ³ ↦ ℂ and ∇2 is the scalar Laplacian.
In spherical polar coordinates, () is satisfied by linear combinations of where zn(ζ) are Bessel functions of kind ζ ∈ {1, 2, 3, 4} and order n = 0…∞, and Ynm are the spherical harmonics with |m| ≤ n, defined using the Condon-Shortley phase. Hence, a general solution to () can be expressed as where ψ(ζ) is a column vector of basis functions, a is a column vector of coefficients (ap ∈ ℂ), and is a composite index introduced for convenience, such that Note that, while in principle p can run from 0 up to ∞, in practice the series in () is made finite by truncating at some maximum multipole order $n_{\rm max}$, so that the column vectors $\mathbf{a} \in \mathbb{C}^{p_{\rm max}}$ and $\pmb{\psi}^{(\zeta)} \in \mathbb{C}^{p_{\rm max}}$ are of finite length $p_{\rm max} = n_{\rm max}^{2} + 2n_{\rm max}$.
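As a concrete illustration of the index bookkeeping, the snippet below implements one common convention for the composite index, p = n(n+1) + m (an assumption — the text does not spell out the exact mapping), and checks that enumerating n = 1…$n_{\rm max}$, |m| ≤ n reproduces the stated truncation size $p_{\rm max} = n_{\rm max}^{2} + 2n_{\rm max}$ (the scalar n = 0, m = 0 term corresponds to p = 0):

```python
def p_index(n, m):
    """Composite index p(n, m); the mapping p = n(n+1) + m is an assumed
    convention, chosen because it is consistent with the stated
    p_max = n_max^2 + 2*n_max when n runs from 1 to n_max."""
    assert abs(m) <= n
    return n * (n + 1) + m

def p_max(n_max):
    return n_max**2 + 2 * n_max

# Enumerating n = 1..n_max, m = -n..n covers p = 1..p_max exactly once
n_max = 4
ps = [p_index(n, m) for n in range(1, n_max + 1) for m in range(-n, n + 1)]
assert sorted(ps) == list(range(1, p_max(n_max) + 1))
print(len(ps))  # p_max = 24 for n_max = 4
```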
For our purposes we only require ζ = 1 (zn(1) = jn, spherical Bessel functions of the first kind) and ζ = 3 (zn(3) = hn, spherical Hankel functions of the first kind), referred to as the regular and the irregular functions, respectively, which are linearly independent. Hence, for brevity and notational convenience we may refer to ψnm(3) as simply ψnm, and ψnm(1) as ψ̃nm. Furthermore, the tilde will also be placed over the coefficients (e.g.~$\widetilde{\mathbf{a}}$) to explicitly indicate the regular basis-set flavour. Note that the existence of two linearly independent basis-sets stems from the fact that the underlying (Helmholtz) partial differential equation is of second order and homogeneous.
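For reference, the two radial flavours are readily evaluated with SciPy (a sketch; hn below denotes the spherical Hankel function of the first kind, assembled from jn and yn):

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn

def jn(n, x):
    """Regular radial dependence: spherical Bessel function of the first kind."""
    return spherical_jn(n, x)

def hn(n, x):
    """Irregular (outgoing) radial dependence: spherical Hankel function of
    the first kind, h_n = j_n + i*y_n."""
    return spherical_jn(n, x) + 1j * spherical_yn(n, x)

# jn is finite at the origin; hn is singular there (via y_n), which is why
# only the regular flavour can represent fields defined at r = 0.
print(jn(1, 2.5), hn(1, 2.5))
assert np.isfinite(jn(1, 1e-12))
assert abs(hn(1, 1e-12)) > 1e10   # blows up near the origin
```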
Now, solving () in spherical polar coordinates leads to two sets of vector-valued functions Mnm(ζ) and Nnm(ζ), which for n = 1…∞ and |m| ≤ n form a complete set of mutually orthogonal angular-dependent functions, independent of ζ. That is, where A and B are placeholders for M and N, Anm(ζ)* denotes the complex conjugate of Anm(ζ), and δij is the Kronecker delta. The radial function fn(A, ζ, ζ′)(kr) is a functional of jn(kr) and/or hn(kr), depending on ζ and ζ′, and its exact form also depends on whether A is M or N.
A general solution to (), up to a maximum multipole index $n_{\rm max}$, can be expressed as where $\widetilde{\mathbf{c}}\in \mathbb{C}^{l_{\rm max}}$ is a column vector of coefficients, $\widetilde{\mathbf{W}}$ is a matrix of dimension $3\times l_{\rm max}$ (i.e. a row vector composed of column vectors $\widetilde{\mathbf{w}}_{l(q=1,n,m)} = \widetilde{\mathbf{M}}_{nm}$ and $\widetilde{\mathbf{w}}_{l(q=2,n,m)} = \widetilde{\mathbf{N}}_{nm}$), and $l_{\rm max}$ is the maximal value of another, more convenient, composite index where q ∈ {1, 2} is sometimes referred to as the mode or type index, which is used to distinguish between the M and N functions.
The irregular variant of () is obtained by simply removing the overhead tildes (e.g.~$\widetilde{\mathbf{W}}\to \mathbf{W}$), which corresponds to switching the radial dependence from jn(kr) to hn(kr) throughout. Also, the most general solution to () is actually given by the superposition of regular and irregular solutions:
Outside a given scatterer, the total field $\mathbf{E}_{\rm tot}(\bf r) = \widetilde{\mathbf{E}}_{\rm inc}({\bf r}) + \mathbf{E}_{\rm sca}({\bf r})$ is partitioned into a known incident contribution $\widetilde{\mathbf{E}}_{\rm inc}({\bf r})$ and an unknown scattered contribution $\mathbf{E}_{\rm sca}({\bf r})$. Both partitions are expanded in terms of VSWs up to some multipole order $n_{\rm max}$, where E is a parameter determining the field amplitude (usually $E = |\widetilde{\mathbf{E}}_{\rm inc} |$), and $\widetilde{\mathbf{c}}_{\rm inc}\in \mathbb{C}^{l_{\rm max}}$ with $\mathbf{c}_{\rm sca} \in \mathbb{C}^{l_{\rm max}}$ are the incident and scattered coefficients, respectively. The association of $\widetilde{\mathbf{E}}_{\rm inc}$ with regular and $\mathbf{E}_{\rm sca}$ with irregular (or outgoing) VSWs is a choice motivated by physical reasoning: (i) $\widetilde{\mathbf{E}}_{\rm inc}(\mathbf{r})$ ought to be well defined everywhere within a finite distance from the origin, which rules out irregular VSWs due to their singular behaviour at r = 0; and (ii) $\mathbf{E}_{\rm sca}(\mathbf{r})$ ought to satisfy the outgoing (Sommerfeld) radiation condition, requiring that $|\mathbf{E}_{\rm sca}(\mathbf{r})|$ decay like 1/|r| as |r| → ∞, which rules out regular VSWs, since they contain incoming and outgoing parts in equal measure and therefore cannot represent a purely outgoing field. Given the linearity of the governing (Helmholtz) equations, imposition of particular BCs at the scatterer’s surface introduces a linear dependence between $\widetilde{\mathbf{c}}_{\rm inc}$ and $\mathbf{c}_{\rm sca}$, causing them to satisfy a relation of the form where T is the so-called “transition” or “transfer” matrix (T-matrix for short), which depends on the scatterer characteristics and is independent of $\widetilde{\mathbf{c}}_{\rm inc}$. The T-matrix elements are determined by matching the BCs over the scatterer’s surface, which is generally a non-trivial task.
However, for a spherically symmetric, homogeneous scatterer centred at the origin, T is a diagonal matrix whose diagonal elements are determined analytically by Mie theory. T-matrices can also be calculated numerically for spheroids and other axially symmetric scatterers. For multiple scatterers, the collective T-matrix can be built up iteratively from the individual one-body T-matrices.
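For the spherically symmetric case, the diagonal T-matrix entries are (up to a sign convention that varies between references) the Mie coefficients. A minimal sketch, assuming the Bohren & Huffman form with a real relative refractive index m and size parameter x = ka:

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn

def riccati_psi(n, x):
    """psi_n(x) = x j_n(x) and its derivative."""
    j, jp = spherical_jn(n, x), spherical_jn(n, x, derivative=True)
    return x * j, j + x * jp

def riccati_xi(n, x):
    """xi_n(x) = x h_n^(1)(x) and its derivative."""
    h = spherical_jn(n, x) + 1j * spherical_yn(n, x)
    hp = (spherical_jn(n, x, derivative=True)
          + 1j * spherical_yn(n, x, derivative=True))
    return x * h, h + x * hp

def mie_ab(n, m, x):
    """Mie coefficients a_n, b_n (assumed Bohren & Huffman convention;
    m = relative refractive index, x = size parameter, both real here)."""
    psi_x, dpsi_x = riccati_psi(n, x)
    psi_mx, dpsi_mx = riccati_psi(n, m * x)
    xi_x, dxi_x = riccati_xi(n, x)
    a = (m * psi_mx * dpsi_x - psi_x * dpsi_mx) / (m * psi_mx * dxi_x - xi_x * dpsi_mx)
    b = (psi_mx * dpsi_x - m * psi_x * dpsi_mx) / (psi_mx * dxi_x - m * xi_x * dpsi_mx)
    return a, b

# Sanity check: no refractive-index contrast (m = 1) means no scattering
a1, b1 = mie_ab(1, 1.0, 0.8)
assert abs(a1) < 1e-12 and abs(b1) < 1e-12
```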
In multiple scattering problems it is often either necessary or simply more convenient to express the same (partial) field in terms of different basis-sets, mutually related by a translation and/or rotation of the coordinate system. This section is concerned with relations between VSW basis-sets or coefficients defined in different reference frames.
Let (θ1, ϕ1) and (θ2, ϕ2) be the spherical angles of the same vector r in coordinate systems 1 and 2, respectively, sharing the same origin. If coordinate system 2 is obtained by rotating coordinate system 1 through Euler angles (α, β, γ), here defined in the “zyz” convention with 0 ≤ α < 2π, 0 ≤ β ≤ π, and 0 ≤ γ < 2π, then where Dμmn = e−iμαdμmn(β)e−imγ and dμmn are the Wigner D- and d-functions. Conveniently, ψ̃nm, $\widetilde{\mathbf{M}}_{nm}$, $\widetilde{\mathbf{N}}_{nm}$, Mnm and Nnm transform in exactly the same manner under rotation, so substituting the desired basis function for ψnm in () will give the appropriate expression. See equations (5.23)-(5.24) of Ref. for details. In our nomenclature, a basis set W(2) ≡ W(kr2) in coordinate system 2 is related to W(1) ≡ W(kr1) in coordinate system 1 via where R(α, β, γ) is a unitary block-diagonal matrix (of size $l_{\rm max} \times l_{\rm max}$), satisfying and with the matrix elements given by where the index l(q, n, m) is defined in (). Note that () also applies to regular waves $\widetilde{\mathbf{W}}$. Now, if a (regular or irregular) spherical wave expansion is described by a vector of coefficients c(1) in coordinate system 1, as well as c(2) in coordinate system 2, then which follows from equating the field expansions and using (), i.e.
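As a small concrete check of these conventions, the n = 1 Wigner block can be written in closed form and verified to be unitary, as the block-diagonal structure of R requires. The d¹ entries below are the standard zyz/Condon-Shortley ones (an assumption insofar as the text does not list them):

```python
import numpy as np

def wigner_d1(beta):
    """Closed-form Wigner d^1_{mu m}(beta); rows/cols ordered mu, m = -1, 0, +1."""
    c, s = np.cos(beta), np.sin(beta)
    r2 = np.sqrt(2.0)
    return np.array([
        [(1 + c) / 2,  s / r2, (1 - c) / 2],
        [-s / r2,      c,       s / r2    ],
        [(1 - c) / 2, -s / r2, (1 + c) / 2],
    ])

def wigner_D1(alpha, beta, gamma):
    """D^1_{mu m} = exp(-i mu alpha) d^1_{mu m}(beta) exp(-i m gamma),
    matching the convention stated in the text."""
    mu = np.array([-1, 0, 1])
    return (np.exp(-1j * mu[:, None] * alpha)
            * wigner_d1(beta)
            * np.exp(-1j * mu[None, :] * gamma))

D = wigner_D1(0.4, 1.1, -0.7)
# Each rotation block is unitary, hence R R^dagger = I block-by-block
assert np.allclose(D @ D.conj().T, np.eye(3))
assert np.allclose(wigner_d1(0.0), np.eye(3))   # zero rotation -> identity
```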
Let us re-label coordinate system 1 as G to indicate a global, space-fixed reference frame, and coordinate system 2 as L for local, scatterer-fixed frame. A T-matrix T(L) expressed in the local frame is transformed into T(G) = RT(L)R† in the global frame, where R(α, β, γ) depends on the Euler angles (α, β, γ) that rotate frame G onto frame L (as opposed to L onto G). For explanation, consider
If the scatterer is rotationally symmetric about the local z-axis, which is tilted by spherical polar angles (θ, ϕ) relative to the global z-axis, then α = ϕ, β = θ, and the value of γ is irrelevant due to axial symmetry, so we can choose γ = 0 to obtain T(G) = R(ϕ, θ, 0)T(L)R(0, −θ, −ϕ). See Sec.~5.2 of Ref.~ for more details.
Consider a point P with coordinates r1 in coordinate system 1, with the origin at O1. If we choose another origin O2 with coordinates r12 relative to O1, then the new coordinates of P relative to O2 will be r2 = r1 − r12, as illustrated in Fig.~. The translation-addition theorem for vector spherical waves states that, in the limit $n_{\rm max} \to \infty$, where $\widetilde{\mathbf{O}}(k\mathbf{r}_{12})$ and O(kr12) are ($l_{\rm max} \times l_{\rm max}$) matrices of regular and irregular translation-addition coefficients (TACs), respectively. Note the conditional statement for irregular waves: the transformation depends on the relative lengths of r2 and r12. In Fig.~, an irregular basis centred at O1 is mapped onto a regular basis centred at O2 via the irregular TACs. However, if O2 were to the left of the bisector, so that r2 > r12, then the irregular basis centred at O1 would be mapped onto an irregular basis centred at O2 via the regular TACs. On the boundary r2 = r12 the regular re-expansion of an irregular wave is not guaranteed to converge (the series converges only strictly inside the ball), so applying the wrong clause generally yields a divergent or incorrect expansion. Perhaps a more helpful interpretation is that, if the origin is translated by r12 (from O1 to O2), then at points with the new coordinates r2 satisfying r2 < r12 (i.e.~within a ball of radius r12) the irregular waves are transformed into regular waves, and at all other points the irregular waves remain irregular. The reason is the singularity of the irregular waves at O1: a regular expansion about O2 is smooth everywhere, so it can only reproduce the field on a ball centred at O2 that excludes O1, and the largest such ball has radius r12.
Note that O1 and O2 are themselves points with coordinates x1 and x2, respectively, in some designated global “laboratory” frame with a fixed origin x0. If we denote the lab-frame coordinates of P by r, then r1 = r − x1, r2 = r − x2, and r12 = r1 − r2 = x2 − x1 ≡ x21. Hence, we follow Stout et al.~and adopt the shorthand notation $\widetilde{\mathbf{O}}^{(i,j)} \equiv \widetilde{\mathbf{O}}(k\mathbf{x}_{ij}) = \widetilde{\mathbf{O}}(k\mathbf{r}_{ji})$, and likewise for O(i, j), yielding Note the reversal of indices i and j in xij = rji, which has caused some confusion in the past.
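The clause bookkeeping in this shorthand can be encoded in a tiny helper (purely illustrative; the function and its return strings are ours, not the formalism's):

```python
import numpy as np

def tac_flavour(x_i, x_j, r):
    """Which TAC flavour re-expands an irregular basis centred at x_i about
    x_j, for evaluation at a lab-frame point r? Follows the clause described
    in the text: inside the translation ball, irregular waves become regular."""
    r_j = np.asarray(r, dtype=float) - np.asarray(x_j, dtype=float)   # new local coords
    x_ji = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)  # translation vector
    if np.linalg.norm(r_j) < np.linalg.norm(x_ji):
        # inside the ball of radius |x_ji| centred at x_j
        return "irregular TACs (result: regular basis)"
    return "regular TACs (result: irregular basis)"

# A point close to the new origin x_j lies inside the translation ball:
print(tac_flavour([0, 0, 0], [2, 0, 0], [2.5, 0, 0]))
# A far-field point lies outside it:
print(tac_flavour([0, 0, 0], [2, 0, 0], [10, 0, 0]))
```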
To express the translation-addition theorem in terms of the coefficients (c) of a VSW expansion, multiply (from the right) both sides of equations () and () by column vector c(i), where the superscript (i) indicates where the VSW expansion is centred. From inspection of the right-hand side we find that Note that (), () and () are in a similar form to the rotation equation (). Also, it is worth reiterating that the above equations hold exactly only in the limit of $n_{\rm max} \to \infty$.
A general translation, from xi to xj by xji = rij = (rij, θij, ϕij), can be separated into three steps:
This factorisation can be expressed in matrix form as for the irregular case, where O(z)(rij) represents the matrix of axial translation coefficients, most of which are zero because an axial translation along z preserves the index m (leaving only $\sim n_{\rm max}^{3}$ of the $\sim n_{\rm max}^{4}$ elements non-zero). Note that the three aforementioned steps correspond to stepwise movement of the local axis, from the perspective of the initial point i, and the transformation corresponds to reading the matrix multiplication in () from left to right; the sequence of steps and the direction of movement are reversed from the perspective of the scatterer at the destination point. More importantly, since all three matrix factors on the right-hand side of () contain mostly zeroes (the rotations are block-diagonal in n, and the axial translation is diagonal in m), operating on a vector of VSW coefficients in a sequence of three steps can actually reduce the scaling of the net computational cost from $\sim n_{\rm max}^{4}$ to $\sim n_{\rm max}^{3}$, when the naïve matrix multiplication on each step is replaced by a custom operation that sums just over the relevant (non-zero) components. See Appendix~ for more details.
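The sparsity claim can be made concrete by counting structurally non-zero elements under the stated selection rules alone (a sketch based purely on those rules, with no actual TAC values):

```python
# Count structurally non-zero matrix elements over the l(q, n, m) index set:
#   rotation R:  couples (q, n, m) <-> (q, n, mu)   (block-diagonal in n and q)
#   axial O^z:   couples (q, n, m) <-> (q', n', m)  (diagonal in m)
#   general O:   dense (no selection rule for a generic translation direction)
n_max = 12
idx = [(q, n, m) for q in (1, 2)
                 for n in range(1, n_max + 1)
                 for m in range(-n, n + 1)]
l_max = len(idx)                      # l_max = 2 * (n_max^2 + 2 n_max)

nnz_R  = sum(1 for (q, n, m) in idx for (q2, n2, m2) in idx if q == q2 and n == n2)
nnz_Oz = sum(1 for (q, n, m) in idx for (q2, n2, m2) in idx if m == m2)
nnz_O  = l_max ** 2

print(nnz_R, nnz_Oz, nnz_O)
# Applying R, O^z, R^{-1} in sequence touches far fewer elements
# (~n_max^3 each) than a single dense O (~n_max^4):
assert nnz_R + nnz_Oz + nnz_R < nnz_O
```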
Another potential advantage of using () is that, after obtaining the three factors for O(j, i), they can be recycled when computing the reverse translation O(i, j) to reduce the overall computational cost. First, beware that O(z)(−rij) is the inverse of O(z)(rij), and note that O(z)(rij) is invariant to interchanging i and j (rij = rji ≥ 0). Actually, [O(z)(rij)]−1 = R(0, π, 0)O(z)(rij)R(0, −π, 0), where R(0, π, 0) is block-diagonal and each block is anti-diagonal. Second, since ϕij = π + ϕji and θij = π − θji, R(ϕji, θji, 0) can be calculated from R(ϕij, θij, 0) using the symmetry relation dμmn(π − θ) = (−1)n − md−μmn(θ) (see eqn.~B.7 in ref.), so that Dμmn(ϕji, θji, 0) = (−1)n − m + 1D−μmn(ϕij, θij, 0), where the extra prefactor of −1 comes from multiplying by e−iπ = −1.
To conclude this section, we note that computing O in general is actually more costly than the combined effort of computing O(z), R, and R−1. However, performing matrix multiplication of the three factors in the right sequence (RO(z)R−1) does not seem to reproduce O exactly, perhaps symptomatic of the numerical subtleties involved in the calculation of O.
This section was written after first reading two papers by Stout et al.~ and then three earlier papers by Mackowski et al.~.
For a cluster of N individual scatterers (each of arbitrary shape), the collectively scattered field $\mathbf{E}_{\rm sca}^{\rm cl}$ can be separated into additive contributions from the individuals, i.e.: where rj = r − xj and xj is the position of the jth scatterer relative to a fixed origin x0. It is important to stress that the VSW expansion for each $\mathbf{E}_{\rm sca}^{(j)}(\mathbf{r})$ in () is developed in terms of irregular waves centred at xj, as indicated by the subscript of rj and the superscript of the corresponding coefficients $\mathbf{c}_{\rm sca}^{(j)}$. It is even more important to note that () does not actually prescribe how exactly the collective field is partitioned among the individuals.
It is useful to define the excitation field $\widetilde{\mathbf{E}}_{\rm exc}^{(j)}(\mathbf{r})$ for each scatterer j, and then develop it in terms of regular VSWs centred at xj, i.e. where $\widetilde{\mathbf{c}}_{\rm inc}^{(j)} = \widetilde{\mathbf{O}}^{(j,0)} \widetilde{\mathbf{c}}_{\rm inc}$ and $\widetilde{\mathbf{O}}^{(j,0)} \equiv \widetilde{\mathbf{O}}(\mathbf{x}_{j} - \mathbf{x}_{0})$. In the last equality of (), W(rl) is transformed into $\widetilde{\mathbf{W}}(\mathbf{r}_{j})$, with the corresponding coefficients given by $\widetilde{\mathbf{c}}^{(j\leftarrow l)} = \mathbf{O}^{(j,l)} \mathbf{c}_{\rm sca}^{(l)}$, where the superscript (j ← l) specifies the scattered-field partition of scatterer l developed (as a regular VSW expansion) about scatterer j. Note the application of the translation-addition theorem clause that is valid only inside the ball of radius rjl (for each l ≠ j) centred at xj. This approach is valid only if the translation ball fully contains the target scatterer’s surface, where the BCs are supposed to be matched. While non-overlapping spherical scatterers are always guaranteed to satisfy this condition, Fig.~ shows that spheroids with aspect ratio greater than 2 can be problematic, because the translation ball can intersect the target scatterer’s surface if another scatterer is sufficiently close (yet still not overlapping). Note that Mackowski explicitly mentions this issue when discussing equations 5-8 in ref.~, but only in the context of spherical scatterers.
From linearity and imposition of appropriate boundary conditions, again, the excitation coefficients $\widetilde{\mathbf{c}}^{(j)}_{\rm exc}$ defined in () must be related to the corresponding scattering coefficients $\mathbf{c}^{(j)}_{\rm sca}$ via where the T1(j) are the one-body transfer matrices. Note that $\widetilde{\mathbf{c}}^{(j)}_{\rm exc} = \widetilde{\mathbf{c}}_{\rm inc}$ for an isolated (j = N = 1) scatterer at the origin; and if the scatterer is spherically symmetric then T1(1) is a diagonal matrix with Mie coefficients running along the diagonal. For non-spherical scatterers, it may also be helpful to rotate (as well as translate) the coordinates to achieve a favourable orientation. From (), (), and () we obtain an equation expressed purely in terms of the field coefficients: which is what Stout et al.~regard as “the fundamental multiple scattering equation” (eqn.~9 in ref.~). It is helpful to rewrite the linear system () in block-matrix form: We can use () to substitute the excitation field coefficients for the scattered field coefficients and, assuming the one-body T-matrices are invertible, obtain We can also rearrange the linear system into which is equivalent to Mackowski’s eqn.~4 in ref.~. In Mackowski’s earlier papers, however, beware that the same “fundamental” equation (eqn.~13 in ref.~ and eqn.~3 in ref.~) has a plus sign instead of the minus, which may be entirely due to a dubious minus sign featuring in the incident field expansion (see eqn.~4 in ref.~). The dubious minus sign is absent in eqn.~2 of the more recent ref.~, and the subsequent eqn.~4 matches our equation (). Phew!
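The block structure of the fundamental equation can be sketched with placeholder matrices (random stand-ins for the one-body T-matrices and irregular TACs, not real scattering data), solving $\mathbf{c}^{(j)}_{\rm sca} - \mathbf{T}_{1}^{(j)}\sum_{l\neq j}\mathbf{O}^{(j,l)}\mathbf{c}^{(l)}_{\rm sca} = \mathbf{T}_{1}^{(j)}\widetilde{\mathbf{c}}^{(j)}_{\rm inc}$:

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 3, 6                       # 3 scatterers, truncated block size L = l_max

# Placeholder data: random one-body T-matrices (scaled down so the coupled
# system is safely invertible) and random stand-ins for the irregular TACs.
T1 = [0.1 * rng.standard_normal((L, L)) for _ in range(N)]
O  = {(j, l): 0.1 * rng.standard_normal((L, L))
      for j in range(N) for l in range(N) if j != l}
c_inc = [rng.standard_normal(L) for _ in range(N)]

# Assemble the (N L) x (N L) block matrix of  I - T1 O  and the RHS  T1 c_inc
A = np.eye(N * L)
b = np.zeros(N * L)
for j in range(N):
    b[j*L:(j+1)*L] = T1[j] @ c_inc[j]
    for l in range(N):
        if l != j:
            A[j*L:(j+1)*L, l*L:(l+1)*L] -= T1[j] @ O[(j, l)]

c_sca = np.linalg.solve(A, b)

# Verify the fundamental equation block-by-block:
# c_sca^(j) = T1^(j) (c_inc^(j) + sum_{l != j} O^(j,l) c_sca^(l))
for j in range(N):
    coupling = sum(O[(j, l)] @ c_sca[l*L:(l+1)*L] for l in range(N) if l != j)
    assert np.allclose(c_sca[j*L:(j+1)*L], T1[j] @ (c_inc[j] + coupling))
```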
Note that (), (), and () are all in the form of a general matrix equation Ax = y, which can be solved for the column vector x without necessarily inverting the matrix A. However, formal inversion facilitates definition of auxiliary T-matrix constructions for multiple scatterers.
The system of coupled matrix equations in () can be solved for f(j) by inverting the matrix to obtain
where TN(j, i) represents pairwise N-body T-matrices, arranged and defined as follows
with the last two lines merely showing how the one-body T-matrices can be factored out in two different ways (left or right). [Q: How to interpret O(i, j)T1(j) and T1(i)O(i, j)? Plausibly: in O(i, j)T1(j) the field is first scattered at particle j and the outgoing result is then re-expanded about centre i, whereas in T1(i)O(i, j) a field is first translated from centre j to centre i and then scattered by particle i.]
The TN(j, i) matrices form a complete multiple-scattering solution of the N-particle system. Stout et al.~call these matrices “scatterer-centred transfer matrices” (see eqn.~12 in ref.~), while Mackowski refers to them as “sphere-centred” (see eqn.~16 in ref.~ and eqn.~4 in ref.~). However, both terms seem equally applicable to TN(j), which Mackowski defines one way (see eqn.~61 in ref.~) but does not give it a name, while Stout et al.~define TN(j) differently and call it “individual N-body transfer matrices” (see eqns.~14 and 27 in ref.~). To avoid any possible confusion, we shall refer to TN(j, i) and Mackowski’s TN(j) as “pairwise” and “individual” N-body T-matrices, respectively, and we stress that TN(j) differs from T1(j) in equation ().
Mackowski’s “contraction” of the index i in TN(j, i) to obtain TN(j) essentially links the individually-centred scattering coefficients, f(j), with the incident field coefficients, $\widetilde{\mathbf{a}}$, for the regular VSW expansion centred at the common origin x0. Mackowski also considers the collective scattering coefficients f(0) for the irregular VSW expansion about the common origin, i.e.
where the second equality relies on a particular clause of the translation-addition theorem, which is valid only outside of the smallest circumscribing sphere (encompassing all N scatterers) centred at x0. From () and () we have
which is analogous to () and provides an explicit expression for the collective TN(0) in terms of the pairwise and (Mackowski’s) individual N-body T-matrices:
which is equivalent to Mackowski’s “cluster-centred” T-matrix (see eqn.~19 in ref.~, eqn.~64 in ref.~, and eqn.~29 in ref.~). Again, it is worth reiterating that () is valid only outside of the cluster’s smallest circumscribing sphere centred at the common origin, and Mackowski explicitly acknowledges this fact in refs.~! I suspect this limitation is present in most other studies that utilise a collective T-matrix, such as ref.~
The total energy transfer from an incident plane wave $\mathbf{E}_{\rm inc}$ into $\mathbf{E}_{\rm sca}$ and/or $\mathbf{E}_{\rm int}$ as a result of scattering can be quantified by the extinction cross-section, $\sigma_{\rm ext}$, which in the current formalism can be calculated from the scatterer-centred coefficients, here for convenience denoted by $\mathbf{f}^{(j)} = \mathbf{c}_{\rm sca}^{(j)}$ and $\mathbf{a}^{(j)} = \widetilde{\mathbf{c}}_{\rm inc}^{(j)}$, using
where ℜ{…} indicates taking the real part of the quantity inside the braces, a(j)† is conjugate transpose of the column vector a(j), aqnm(j)* is the conjugate of the vector component aqnm(j), and $k_{\rm med}$ is the (real-valued) wave number for the (non-absorbing) medium. Note that () is from eqn.~29 of ref.~, which follows from simplifying eqn.~43 of ref.~ and substituting into eqn.~42 (of same reference); but it apparently differs from eqn.~H.65 of ref.~, where the conjugation is swapped, nonetheless producing the same result since ℜ{ab*} = ℜ{a*b} for any complex numbers a and b. Also note that $\sigma_{\rm ext}$ is separable into additive contributions from individual scatterers, i.e. $\sigma_{\rm ext} = \sum_{j}\sigma_{\rm ext}^{(j)}$. Mackowski cites the optical theorem when stating analogous expressions for $\sigma_{\rm ext}$ in eqns.~23-26 of ref.~, missing the minus sign (because of an extra minus sign in Mackowski’s definition of incident field coefficients) and featuring an extra factor of 4π (due to different normalisation condition for the angular part of VSWs, which can be seen by comparing the Emn in eqn.~21 of ref.~ with the γmn in eqn.~C.22 of ref.~). To spice things up, in Mackowski’s earlier paper the 4π is 2π in the extinction cross-section formula (eqn.~31 of ref.~), where the minus sign is also missing yet there is no extra minus sign in the definition of incident field coefficients to explain the discrepancy. In Mackowski’s latest paper, the 4π again becomes 2π in the extinction cross-section formula (eqn.~24 in ref.), which also includes the minus sign, consistent with the absence of the dubious minus sign in the definition of incident field coefficients. Note that Suryadharma and Rockstuhl seem to have the same “4π and no minus sign” convention as in ref.~, and the comparison between eqns.~64-65 of ref.~ may offer an alternative explanation for this pervasive discrepancy.
The scattering cross-section for a given excitation, $\sigma_{\rm sca}$, can be computed using
where the composite index l defined in () has been used for brevity. Note that () is also from eqn.~29 of ref.~, which follows from substituting eqn.~45 of ref.~ into eqn.~42 (of same reference). It seems to differ by a factor of 4π from eqn.~35 of ref.~, but for a single scatterer (N = 1) it does reduce to eqn.~H.64 of ref.~. My understanding is that () is derived by considering the outward radial component of the Poynting vector for the scattered field and integrating it over the surface of a sphere enclosing all the scatterers.
The absorption cross-section, $\sigma_{\rm abs}$, can be deduced from conservation of energy: or
Recall that $\sigma_{\rm ext}$, $\sigma_{\rm sca}$, and $\sigma_{\rm abs}$ are defined for a particular incident field direction. The corresponding orientationally averaged quantities are obtained by averaging over all possible orientations, producing $\langle \sigma_{\rm ext}\rangle$, $\langle \sigma_{\rm sca}\rangle$ and $\langle \sigma_{\rm abs}\rangle$.
The T-matrix formalism is founded on the premise that everything converges below a finite multipole order $n_{\rm max} < \infty$, i.e.~the field coefficients and T-matrix elements become negligible for $n > n_{\rm max}$. Hence, while the formalism (including the translation-addition theorem for VSWs) is exact only in the impractical limit of n → ∞, sufficiently accurate numerical results can be obtained with finite vectors and matrices truncated above $n_{\rm max}$. The question is: what is the appropriate choice of $n_{\rm max}$ for a given system and given numerical scheme? Should one choose just one value of $n_{\rm max}$, or multiple values, each used at a different stage of the scheme? How exactly does the truncation error propagate from stage to stage? Answering these questions requires some numerical analysis and convergence testing…
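In practice such convergence questions are probed empirically: increase $n_{\rm max}$ until a target observable stops changing. A schematic loop, with a toy, exponentially convergent stand-in for the truncated cross-section (a real solver's multipole contributions would replace it):

```python
def sigma_up_to(n_max):
    """Toy stand-in for a cross-section computed at truncation n_max:
    partial sums of exponentially decaying multipole contributions."""
    return sum(0.5**n for n in range(1, n_max + 1))   # -> 1.0 as n_max -> inf

tol, n_max = 1e-6, 1
sigma_prev = sigma_up_to(n_max)
while True:
    n_max += 1
    sigma = sigma_up_to(n_max)
    if abs(sigma - sigma_prev) / abs(sigma) < tol:    # relative-change criterion
        break
    sigma_prev = sigma

print(n_max, sigma)   # converged truncation order and value
```

Note that a relative-change criterion only bounds the step-to-step drift; for genuinely exponentially convergent contributions (as here) it also bounds the remaining tail, but for slowly convergent series it can stop too early.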
To “solve” a multiple scattering problem can mean different things. Usually it means to determine the scattering coefficients $\mathbf{c}^{(j)}_{\rm sca}$ for given individual scatterer properties (described by T1(j)) and an excitation field (described by $\widetilde{\mathbf{c}}^{(j)}_{\rm inc}$) impinging from a particular direction. This kind of “solution” can be achieved by solving the linear system of equations in () for $\mathbf{c}^{(j)}_{\rm sca}$ without performing matrix inversion, thus producing a complete description of the scattered field for the given excitation. The linear system would have to be solved again (entirely from scratch) for a different excitation. Performing the matrix inversion to determine the N-body T-matrices becomes worthwhile only when many impinging directions are to be considered, or when rotationally-averaged quantities are of primary interest.
A brute force approach to solving the multiple scattering problem would be to construct the NL × NL matrix in equation () and then invert it to obtain the pairwise matrices TN(j, i). However, this approach is computationally demanding. To help alleviate the cost of one large matrix inversion, Stout et al.~proposed a recursive scheme where a smaller (L × L) matrix is inverted N − 1 times.
In the recursive algorithm described by Stout et al., the elements of TN(j, i) are accumulated recursively from auxiliary subsystems, incrementally built up from one to N particles. The recursive system is prescribed by the following four equations: where a common matrix sum has been underlined. Note that only one L × L matrix is inverted on each of the N − 1 iterations. The inverted matrix becomes ill-conditioned for large $n_{\rm max}$, but the associated problems can be (at least partly) circumvented by applying the recursive scheme to appropriately “balanced” matrices and coefficients: where $\widetilde{\mathbf{D}}^{(j)}$ and D(j) are regular and irregular diagonal matrices with the Riccati-Bessel functions ψn(x) = xjn(x) or ξn(x) = xhn(x) on the diagonal. (Here, jn(x) and hn(x) are, respectively, the spherical Bessel and Hankel functions of the first kind.) Note that what Stout et al.~call “matrix balancing” could also be described as “left and right preconditioning”, because the matrix to be inverted is essentially left- and right-multiplied (i.e.~preconditioned) by two different matrices to improve its condition number, which is supposed to make numerical inversion more robust and accurate. Instead of balancing throughout, as Stout et al.~seem to do, it may be better to localise the balancing act just at the inversion step in (), i.e. where $\widehat{\mathbf{S}}$ is obtained by balancing S analogously to $\widehat{\mathbf{H}}^{(j,i)}$.
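The effect of such left/right preconditioning on the condition number can be illustrated generically. The diagonal weights below are synthetic stand-ins for the Riccati-Bessel factors (which grow or decay rapidly with n), not the actual functions:

```python
import numpy as np

rng = np.random.default_rng(1)
L = 12

# Synthetic stand-ins: a well-behaved "core" matrix wrapped in rapidly
# growing/decaying diagonal weights, mimicking how Riccati-Bessel factors
# skew the magnitudes of high-order rows and columns.
core = np.eye(L) + 0.1 * rng.standard_normal((L, L))
D_left  = np.diag(10.0 ** np.arange(L))      # grows, like the irregular radial functions
D_right = np.diag(10.0 ** -np.arange(L))     # decays, like the regular radial functions
H = D_left @ core @ D_right                  # the "unbalanced" matrix to invert

# Left/right preconditioning recovers the well-conditioned core
H_balanced = np.linalg.solve(D_left, H) @ np.linalg.inv(D_right)

print(np.linalg.cond(H), np.linalg.cond(H_balanced))
assert np.linalg.cond(H_balanced) < np.linalg.cond(H)
```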
Note that equation () can be factored in two ways: either of which may be preferred if T1(N) is difficult to invert. To facilitate the inversion of SL and SR, slightly different balancing (and subsequent unbalancing) should be used: Here SL is balanced like irregular VTACs (or the inverse of a T-matrix) using only irregular weights, while SR is balanced like a T-matrix using only regular weights. In my experience, $\widehat{\mathbf{S}}_{R}$ is much better conditioned for inversion than $\widehat{\mathbf{S}}$, and I haven’t compared with $\widehat{\mathbf{S}}_{L}$.
From the equations () and () it follows that or, equivalently, which is a linear system of the general form AX = B, where we want to find the matrix X for given A and B, the latter containing multiple right-hand sides. That is, each column of X and B can be treated independently, so we effectively solve many linear systems of the form Axν = bν, where xν and bν are the ν’th columns of X and B, respectively. This perspective suggests that matrix inversion is more demanding than solving a single linear system, because inversion amounts to solving many linear systems (one per column).
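Concretely, with NumPy the whole block of right-hand sides is passed to a single factorisation-based solve rather than forming A⁻¹ explicitly (random placeholder data):

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_rhs = 8, 5
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # well-conditioned placeholder
B = rng.standard_normal((n, n_rhs))                 # multiple right-hand sides

# One LU factorisation solves all columns at once: A X = B
X = np.linalg.solve(A, B)
assert np.allclose(A @ X, B)

# Equivalent to, but cheaper and more stable than, explicit inversion:
assert np.allclose(X, np.linalg.inv(A) @ B)
```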
Mackowski and Mishchenko (M&M) “contract” the second particle index of TN(j, i), using $\mathbf{T}_{N}^{(j)} = \sum_{i} \mathbf{T}_{N}^{(j,i)} \widetilde{\mathbf{O}}^{(i,0)}$, to rewrite the linear system in terms of individual (as opposed to pairwise) scatterer-centred T-matrices TN(j), i.e. or, equivalently,
where the number of independent linear systems of the form Axν = bν is now reduced. [Does this index contraction affect the “optimal” choice of $n_{\rm max}$?] M&M claim to use the (bi)conjugate gradient method to solve Axν = bν for each ν, where the row order (i.e.~length of each column vector) is predetermined using Lorenz/Mie theory for each (spherical) scatterer in isolation, and the truncation limit for the column order (i.e.~the maximum value of ν) is determined on-the-fly from the convergence of each scatterer’s contribution to the collective rotationally-averaged extinction cross-section (see eqn.~66 in ref.~).
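The contraction $\mathbf{T}_{N}^{(j)} = \sum_{i} \mathbf{T}_{N}^{(j,i)} \widetilde{\mathbf{O}}^{(i,0)}$ is a plain sum of matrix products, conveniently written as an einsum over placeholder arrays (random stand-ins for the pairwise T-matrices and regular TACs):

```python
import numpy as np

rng = np.random.default_rng(3)
N, L = 4, 6
T_pair = rng.standard_normal((N, N, L, L))   # placeholder pairwise T_N^(j,i)
O_reg  = rng.standard_normal((N, L, L))      # placeholder regular O^(i,0)

# T_N^(j) = sum_i T_N^(j,i) @ O^(i,0): contract over i and the inner matrix index
T_ind = np.einsum('jiab,ibc->jac', T_pair, O_reg)

# Cross-check against an explicit loop over the second particle index
T_loop = np.array([sum(T_pair[j, i] @ O_reg[i] for i in range(N))
                   for j in range(N)])
assert np.allclose(T_ind, T_loop)
```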