
White Paper

Signal Selector

Jonas Svallin
Senior Director Quantitative Solutions
[email protected]

Georgi Mitov
Director of Research
[email protected]

Nikolay Radev
Senior Quantitative Researcher
[email protected]

Alexander Atanasov
Quantitative Researcher
[email protected]

January 28, 2025

Copyright © 2025 FactSet Research Systems Inc. All rights reserved.

Contents

1. Introduction
2. Multicollinearity Reduction
   2.1. Correlation Plot
   2.2. Singular Value Decomposition (SVD) Analysis
   2.3. Variance Inflation Factor (VIF)
3. Selecting the Best Return-Predicting Signals
   3.1. Stepwise Regression Selection
   3.2. Monte Carlo Stepwise Regression
   3.3. Lasso Regression
4. Conclusion

Appendix


1. Introduction
With sufficient time and energy, numerous signals (also referred to as factors) can be discovered whose values show a significant correlation with subsequent returns. However, quantitative investment management, much like portfolio construction, is ultimately about selecting a set of signals that collectively are correlated with future returns. This implies an assumption: the presence of more signals, up to a point, provides diversification benefits. Hence, it is crucial to select the most suitable set of signals from the broader pool available. In practice, most investment professionals use more than one alpha source, making it essential to combine these multiple alpha signals optimally. The process of blending alpha signals optimally is complex and includes different stages (Fig. 1). We will discuss the first step, suitability selection, in detail and show how the FactSet Programmatic Environment (FPE) SignalSelector module can be utilized for that purpose.

Figure 1: Equity research workflow

Variable selection for a linear regression model is more of an art than a science. While there are various selection algorithms and practices, none guarantee universal applicability. Frequently, the final solution combines expert opinion with algorithmic output. FPE users have access to a suite of algorithms and visualization tools enabling adequate decision-making. Signal selection often proceeds along two main avenues:

• Reduction of multicollinearity

• Selection of the subset that best predicts stock returns without overfitting

The first approach aims to minimize the number of signals by eliminating those that either duplicate or
correlate too highly with other signals. The second aims to choose the signals that best predict returns.
Often these methods are applied in combination – first reducing to a linearly independent set, and then
selecting the best regression model features out of this set. These two algorithmic steps complement each
other effectively as the first step tends to manage larger signal sets, while the second step performs more
reliably and efficiently with smaller, linearly independent sets. In Section 2., we examine the available
techniques used for reducing multicollinearity: Correlation Plot, Singular Value Decomposition (SVD), and Variance Inflation Factors (VIF). In Section 3., we discuss methods used for signal selection based on predictive capability for returns: Stepwise Regression (SwR), Monte Carlo Stepwise Regression (MCSwR), and Lasso Regression.

2. Multicollinearity Reduction
Multicollinearity in a regression model means that one or more explanatory variables (signals) are linear functions of others, or are highly correlated with them. This is a problem because explanatory variables should be independent; the presence of multicollinearity compromises the reliability of statistical inference during model fitting and interpretation.

2.1. Correlation Plot


A fast but simplistic way of identifying multicollinearity is the correlation plot, which allows quick identification of high pairwise correlations amongst signals. However, it offers limited insight into more intricate linear relationships: a set of three or more signals may exhibit only small pairwise correlations yet still share a strong linear relationship (Fig. 2). Owing to this limitation, more advanced techniques are proposed in the following subsections.
A correlation plot produced by the SignalSelector module is provided in Figure 2. In this example, the ‘Value Composite’ is the equal-weighted average of the other four Value signals, yet its individual correlations to them are lower than the correlation between the Velocity and Short-Term Reversal Technical signals, which are distinct, though naturally correlated, signals.

Figure 2: An example Correlation Plot from the SignalSelector module.
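As an aside on the underlying computation (a minimal Python sketch, not the SignalSelector API), pairwise correlations and a shortlist of highly correlated pairs can be obtained with pandas; the signals DataFrame and the high_correlation_pairs helper below are hypothetical:

import pandas as pd

def high_correlation_pairs(signals: pd.DataFrame, threshold: float = 0.7):
    # 'signals' holds one column per signal; corr() gives the pairwise
    # Pearson correlations that a correlation plot visualizes
    corr = signals.corr()
    cols = corr.columns
    pairs = [
        (cols[i], cols[j], corr.iloc[i, j])
        for i in range(len(cols))
        for j in range(i + 1, len(cols))
        if abs(corr.iloc[i, j]) >= threshold
    ]
    # strongest pairs first; corr itself can be fed to a heatmap for the plot
    return corr, sorted(pairs, key=lambda p: -abs(p[2]))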

2.2. Singular Value Decomposition (SVD) Analysis


Having exact linear relationships within the variables of a linear regression poses significant problems – the
covariance matrix is singular and cannot be properly inverted, and the regression has degenerate solutions.
For this reason, we have equipped FPE with the functionality to identify and present detailed information
on these linear relationships to the user, who can then make an informed decision on which signals to remove
before proceeding with regression analyses. We employ Singular Value Decomposition (SVD) analysis of the
signals matrix (where each signal is a column) to extract the linear relationship coefficients for all the exact
linear relations in the set.
Figure 3 shows an example report from the SVD method of the SignalSelector module. The algorithm was applied to a set of 17 signals and discovered a linear relationship between five of them, meaning one of the five can be removed without loss of information. The report also prints the linear relation itself, which confirms, as expected, that the ‘Value Composite’ is the equal-weighted average of the other four ‘Value-- ...’ signals.
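The core of such an analysis can be sketched with numpy's SVD (a simplified illustration under the assumption of more observations than signals, not the SignalSelector implementation):

import numpy as np

def exact_linear_relations(X: np.ndarray, rel_tol: float = 1e-10):
    # X has one signal per column; assumes more rows (observations) than columns
    _, s, vt = np.linalg.svd(X, full_matrices=False)
    # right-singular vectors with (numerically) zero singular values span the
    # null space: each such vector c satisfies X @ c ≈ 0, i.e. an exact linear
    # relation among the signals; rescaling c so one coefficient equals -1
    # reproduces relations like the one reported in Figure 3
    return vt[s <= rel_tol * s[0]]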

2.3. Variance Inflation Factor (VIF)


SVD takes care of exact linear relationships, and a correlation plot can give some insight into multicollinearity in a set of signals, but researchers often need an unambiguously quantifiable measure of near-multicollinearity in a regression model (as opposed to exact linear relations), along with algorithms to reduce that multicollinearity once it is quantified. Variance Inflation Factors (VIF) are reliable and widely used metrics for multicollinearity [3]. The VIF measures how strongly each independent variable (signal) in a set is related to a linear combination of the rest. The VIF for the j-th variable is:

VIF_j = 1 / (1 − R_j²)    (1)

where R_j² is the R² value obtained by regressing the j-th explanatory variable on all the other remaining variables:

x_{j,t} = b_0 + Σ_{i≠j} b_i x_{i,t} + ϵ_t    (2)

VIF can be calculated for each independent variable, and a high VIF indicates that the corresponding variable
is highly collinear with the other variables in the model. Typically, VIF values above 3 ∼ 5 are considered
high and indicate problematic multicollinearity [3]. Based on this quantity, we provide a procedure to try
to eliminate the multicollinearity in a set of variables. The procedure iteratively removes the variable with
the highest VIF value until all variables have values lower than a selected threshold (VIFs are recalculated
after each removal, as the regression in Eq. (2) changes). Figure 4 shows an example report from the VIF
selection method of the SignalSelector module. In that example, two of the signals are strongly correlated
to each other and have VIFs above the specified threshold. One of them is dropped, causing the other’s VIF
to drop below the threshold.
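A minimal sketch of this iterative procedure, using statsmodels rather than the SignalSelector internals (the signals DataFrame is hypothetical, and the default threshold of 4 mirrors the report in Figure 4):

import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

def vif_reduce(signals: pd.DataFrame, threshold: float = 4.0) -> pd.DataFrame:
    kept = signals.copy()
    while kept.shape[1] > 1:
        X = sm.add_constant(kept)  # intercept b_0, as in Eq. (2)
        # column 0 is the constant, so VIFs are computed for columns 1..p
        vifs = pd.Series(
            [variance_inflation_factor(X.values, i) for i in range(1, X.shape[1])],
            index=kept.columns,
        )
        if vifs.max() < threshold:
            break
        # drop the worst offender and recompute: Eq. (2) changes after removal
        kept = kept.drop(columns=[vifs.idxmax()])
    return kept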
So far, we have discussed various methods to reduce a pool of signals to a linearly independent subset but
have not yet matched any of these signals to returns. In the following section, we present algorithms that
select signals based on how well they predict stock returns.


------------------------------------
Singular Values Decomposition Report
------------------------------------
Signals list (17):
['Value-- Book Yield', 'Value-- Earnings Yield', 'Value-- Sales Yield', 'Value-- CFO Yield',
'Sentiment-- Stdzd Analyst PT', 'Sentiment-- PT Revisions', 'Sentiment-- Earnings Est. Revisions',
'Sentiment-- Earnings Est Stability', 'Quality-- FCF Mgn', 'Quality-- FCF Mgn Stability',
'Quality-- Interest Coverage', 'Quality-- Piotroski F-Score', 'Technical-- Avg True Range',
'Technical-- Short-Term Reversal', 'Technical-- Velocity', 'Technical-- DownBetaR2', 'Value Composite']

Signal scores were used.

Results
-------
There are total of 1 redundant signals.

Linear dependence found in the following groups of signals, with number of redundant signals in brackets:
('Value-- Book Yield', 'Value-- Earnings Yield', 'Value-- Sales Yield', 'Value-- CFO Yield', 'Value Composite') (1)

The linear dependence relations are:


(0.250)*Value-- Book Yield + (0.250)*Value-- Earnings Yield + (0.250)*Value-- Sales Yield
+ (0.250)*Value-- CFO Yield + (-1.000)*Value Composite = 0

Figure 3: An example report of the Singular Value Decomposition method of the SignalSelector module.

3. Selecting the Best Return-Predicting Signals


In this section we focus on creating a regression model that predicts returns:


r_t = b_0 + Σ_i b_i x_{i,(t−1−lag)} + ϵ_t    (3)

While having a linearly independent set of signals is desirable for a reliable regression model, not every
signal is necessary for a good model – overfitting and data mining are common pitfalls that must be avoided.
Information theory provides some useful ways of quantifying the quality of a regression model as a balance
between goodness of fit and limiting the number of independent variables. Information criteria, like the
Bayesian Information Criterion (BIC) (4), Akaike Information Criterion (AIC) (5), and Corrected Akaike
Information Criterion (AICc) (6), are powerful and well-established statistical metrics used in many scientific
fields [1, 2].

BIC = k ln(n) − 2 ln(L̂)    (4)

AIC = 2k − 2 ln(L̂)    (5)

AICc = AIC + (2k² + 2k) / (n − k − 1)    (6)

Here, for a given regression model, k is the number of variables (signals), n is the sample size (number of data points), and ln(L̂) = −(n/2) (ln(2π) + ln(ssr/n) + 1) is the maximized log-likelihood function, with ssr being the sum of squared residuals.
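For concreteness, the three criteria can be computed from an ordinary least-squares fit as follows (a minimal numpy sketch of Eqs. (4)-(6), not the SignalSelector code; X is assumed to contain an intercept column, and k counts all fitted parameters):

import numpy as np

def information_criteria(X: np.ndarray, y: np.ndarray) -> dict:
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ssr = float(np.sum((y - X @ beta) ** 2))  # sum of squared residuals
    # maximized Gaussian log-likelihood: ln(L̂) = -(n/2)(ln(2π) + ln(ssr/n) + 1)
    loglik = -0.5 * n * (np.log(2.0 * np.pi) + np.log(ssr / n) + 1.0)
    aic = 2 * k - 2 * loglik                             # Eq. (5)
    return {
        "AIC": aic,
        "AICc": aic + (2 * k**2 + 2 * k) / (n - k - 1),  # Eq. (6)
        "BIC": k * np.log(n) - 2 * loglik,               # Eq. (4)
    }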
We can use these criteria as the optimization target for our signal selection methods. There are no analytic solutions to this discrete problem, and while it would seem attractive to run a best-subset regression (exhaustive search), the combinatorial explosion becomes unmanageable even with as few as 20 signals


---------------------------------------------------------------
Variance Inflation Factor selection: 2020-12-31, universe
---------------------------------------------------------------

Parameters:
-----------
Time window: 47

Considered signals list (16):


['Value-- Book Yield', 'Value-- Earnings Yield', 'Value-- Sales Yield', 'Value-- CFO Yield',
'Sentiment-- Stdzd Analyst PT', 'Sentiment-- PT Revisions', 'Sentiment-- Earnings Est. Revisions',
'Sentiment-- Earnings Est Stability', 'Quality-- FCF Mgn', 'Quality-- FCF Mgn Stability',
'Quality-- Interest Coverage', 'Quality-- Piotroski F-Score', 'Technical-- Avg True Range',
'Technical-- Short-Term Reversal', 'Technical-- Velocity', 'Technical-- DownBetaR2']

Protected signals list (0):


[]

Signal scores were used.

VIF Threshold: 4

Results:
--------
Selected signals (15):
['Value-- Book Yield', 'Value-- Earnings Yield', 'Value-- Sales Yield', 'Value-- CFO Yield',
'Sentiment-- Stdzd Analyst PT', 'Sentiment-- PT Revisions', 'Sentiment-- Earnings Est. Revisions',
'Sentiment-- Earnings Est Stability', 'Quality-- FCF Mgn', 'Quality-- FCF Mgn Stability',
'Quality-- Interest Coverage', 'Quality-- Piotroski F-Score', 'Technical-- Avg True Range',
'Technical-- Velocity', 'Technical-- DownBetaR2']

Removed signals (1):


['Technical-- Short-Term Reversal']

WARNING: VIF selection only drops signals to reduce multicollinearity, but not all selected signals
are necessarily significant to explaining and predicting returns

Figure 4: A report from the Variance Inflation Factors (VIF) selection method of the SignalSelector module. One of a pair of correlated signals was dropped.

Copyright © 2025 FactSet Research Systems Inc. All rights reserved. 7 FactSet Research Systems Inc. | www.factset.com
White Paper

(more than a million combinations, and at 40 signals, a trillion). Therefore, a more efficient process that converges on the desired solution in a directed manner is necessary. The following two subsections present two methods available in the SignalSelector module of FPE that leverage information theory in the selection of signals that best predict returns.
The final subsection covers a method that adopts an alternative approach to finding a constrained optimal subset of return-predicting signals: Lasso Regression. It is a constrained version of simple least-squares regression:

min_{b_0, b} ‖r_t − b_0 − x_{(t−1−lag)} b‖₂²  subject to  ‖b‖₁ ≤ L    (7)

An equivalent formulation of the Lasso Regression, and also the one used in SignalSelector, is the Lagrangian form:

min_{b_0, b} { (1/N) ‖r_t − b_0 − x_{(t−1−lag)} b‖₂² + λ‖b‖₁ }    (8)
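For reference, problem (8) can be solved with an off-the-shelf package such as scikit-learn (a sketch, not the SignalSelector implementation; note that scikit-learn minimizes (1/(2N))‖·‖₂² + α‖b‖₁, so its α corresponds to λ/2 in Eq. (8)):

from sklearn.linear_model import Lasso

def lasso_select(X, y, lam: float):
    # fit_intercept=True plays the role of the unpenalized b_0 in Eq. (8)
    model = Lasso(alpha=lam / 2.0, fit_intercept=True)
    model.fit(X, y)
    # signals whose coefficients were driven exactly to zero are removed
    selected = [i for i, b in enumerate(model.coef_) if b != 0.0]
    return selected, model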

3.1. Stepwise Regression Selection


Our Stepwise Regression (SwR) method implements a bidirectional approach to optimize a selected information criterion from the above (4, 5, 6). It requires a starting model of the form (3) that includes an intercept and any subset of the available signals (possibly all of them, or none). The starting model need not be ‘good’ or motivated in any way. The chosen information criterion of the current (starting) model is calculated, a search is performed for the best single signal to add to or remove from it in order to maximally improve the criterion, and the current model is updated accordingly. This step is repeated until no further improvement can be made, akin to gradient descent. Note that there is no direct control over the cardinality of the selected set; rather, the number of signals selected for the ‘best fit’ is dictated by the penalty imposed in the definition of the respective information criterion. BIC has a cardinality penalty that grows with sample size and is more stringent than those of AIC and AICc for sample sizes n ≥ 8 (i.e., in all practical cases). AIC and AICc are effectively the same for large sample sizes (n ≫ k), but AICc has a quadratic correction to the cardinality penalty, which improves its performance in the small-sample regime (smaller asset universe, shorter time window). Figure 5 shows an example Stepwise Regression selection report from the FPE SignalSelector module. There, the method was applied to a set of 16 signals, optimizing for AIC, with an arbitrary starting model of four signals. It took five steps – adding four signals and removing one – to reach the termination point (a local minimum of AIC).
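The search loop itself can be sketched as follows (a simplified illustration, not the SignalSelector implementation; it reuses the hypothetical information_criteria helper above, and signals is assumed to map names to aligned 1-D arrays):

import numpy as np

def stepwise_select(signals: dict, y: np.ndarray, start: set, criterion: str = "AIC"):
    def score(subset):
        # model of form (3): intercept plus the currently selected signals
        cols = [signals[name] for name in sorted(subset)]
        X = np.column_stack([np.ones(len(y))] + cols)
        return information_criteria(X, y)[criterion]

    current = set(start)
    current_score = score(current)
    while True:
        # bidirectional step: toggle each signal in or out and score the result
        trials = [(score(current ^ {name}), name) for name in signals]
        best_score, best_name = min(trials)
        if best_score >= current_score:
            return current, current_score  # no single move improves: local minimum
        current ^= {best_name}
        current_score = best_score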
One potential drawback of this ‘gradient-descent-style’ algorithm is that there can exist multiple local minima
of the information criterion in the parameter space, but only one of them is the best global solution. In
stepwise regressions, this problem comes in the form of feature suppression. For example, replacing a variable
with another in the model takes two steps (removing one and adding the other, in any order), and if both of
these steps would make the current model worse individually, the combined replacement will not be made
even if the overall result would be an improvement. Because of this, it is possible to get different solutions
depending on the starting model. In the following subsection, we present an enhanced method that can
compensate for the drawbacks of the stepwise regression.

3.2. Monte Carlo Stepwise Regression


An easy way to counter the starting-model sensitivity of stepwise regression is to perform multiple iterations of it with different starting points and pick the best solution. That is what our Monte Carlo Stepwise Regression (MCSwR) method does, using a randomized starting model for each iteration.


---------------------------------------------------------
Stepwise Regression selection: 2020-12-31, universe
---------------------------------------------------------

Parameters:
-----------
Time window: 47

Considered signals list (16):


['Value-- Book Yield', 'Value-- Earnings Yield', 'Value-- Sales Yield', 'Value-- CFO Yield',
'Sentiment-- Stdzd Analyst PT', 'Sentiment-- PT Revisions', 'Sentiment-- Earnings Est. Revisions',
'Sentiment-- Earnings Est Stability', 'Quality-- FCF Mgn', 'Quality-- FCF Mgn Stability',
'Quality-- Interest Coverage', 'Quality-- Piotroski F-Score', 'Technical-- Avg True Range',
'Technical-- Short-Term Reversal', 'Technical-- Velocity', 'Technical-- DownBetaR2']

Protected signals list (0):


[]

Starting model signals (4):


['Value-- Book Yield', 'Sentiment-- PT Revisions', 'Quality-- Interest Coverage', 'Technical-- DownBetaR2']

Information criterion: AIC

Returns lag: 0 periods

Results:
--------
Selected signals (7):
['Value-- Book Yield', 'Value-- Earnings Yield', 'Quality-- FCF Mgn Stability', 'Quality-- Interest Coverage',
'Technical-- Avg True Range', 'Technical-- Short-Term Reversal', 'Technical-- DownBetaR2']

Removed signals (9):


['Value-- Sales Yield', 'Value-- CFO Yield', 'Sentiment-- Stdzd Analyst PT', 'Sentiment-- PT Revisions',
'Sentiment-- Earnings Est. Revisions', 'Sentiment-- Earnings Est Stability', 'Quality-- FCF Mgn',
'Quality-- Piotroski F-Score', 'Technical-- Velocity']

AIC: -44973.03271716798

R-squared: 0.0029447592175323445

Log of the procedure steps:


---------------------------

Figure 5: A report from the Stepwise Regression selection method of the SignalSelector module. A detailed log of the steps taken by the procedure and the evolution of the information criterion and R² can be seen.


The starting model for each iteration is randomly selected by ‘tossing a coin’ for each available signal to decide whether it is included. The method also provides statistical information that helps gauge the confidence in the reached solution (Fig. 6). We have found, throughout testing (including exhaustive brute-force validation for small enough sets of signals), that the number of iterations, N, should be at least:

N ≳ (number of signals)

Additionally, a further increase of N, ensuring that

N ≫ (number of different solutions found throughout the iterations)

is desirable for strong confidence that the best global solution was found among these iterations. Ideally, each of the different solutions should occur more than a handful of times, say more than the number of different solutions found. This last condition is not a strict requirement, as there can often be ‘rare’ solutions reached only from a few specific starting models, and these are almost never the best solutions. An example can be seen in Figure 6, where 50 iterations were used to select an optimal BIC subset from among 15 signals (50 > 15). Two potential solutions were found (50 ≫ 2), reached 20 and 30 times respectively (20, 30 > 2). In fact, the better, and more commonly found, of these two solutions is the best global solution, as confirmed by an exhaustive subset search in this case.
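The wrapper itself can be sketched as follows, building on the hypothetical stepwise_select above (again a simplified illustration, not the SignalSelector implementation):

import random
from collections import Counter

def mc_stepwise_select(signals, y, n_iter=50, criterion="AIC", seed=0):
    rng = random.Random(seed)
    solutions = Counter()  # occurrence frequency of each local solution found
    best_crit, best_subset = float("inf"), frozenset()
    for _ in range(n_iter):
        # random starting model: a 'coin toss' per available signal
        start = {name for name in signals if rng.random() < 0.5}
        subset, crit = stepwise_select(signals, y, start, criterion)
        solutions[frozenset(subset)] += 1
        if crit < best_crit:
            best_crit, best_subset = crit, frozenset(subset)
    return best_subset, best_crit, solutions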
The Appendix provides empirical statistics showing the reliability and scalability (as compared to brute-force/exhaustive search methods) of the stepwise regression methods. Monte Carlo Stepwise Regression is significantly more tractable with respect to the number of signals and is capable of reliably finding the optimal information-criterion subset from a pool of signals numbering up to the low hundreds (100 ∼ 200) in under an hour (a task that would take longer than the current age of the universe with a simple exhaustive search algorithm).

3.3. Lasso Regression


The Lasso Regression is a simpler and numerically faster method than optimizing for an information criterion; however, it is more naive in the way signals are selected or removed, simply penalizing the absolute size of the regression coefficients. The larger the penalty parameter λ, the more regression coefficients become 0, effectively removing the corresponding signals from the model. The presence of a free parameter in the model allows for direct control over the number of selected signals. Our SignalSelector method for Lasso Regression provides direct control of the penalty parameter λ; alternatively, a desired number of selected signals (k) can be provided, in which case the method finds the penalty parameter that achieves it and selects that many signals. Another benefit of the Lasso Regression is that it is fully and deterministically solvable (numerically), so it finds the same definitive solution of Eq. (8) every time. An example report from the Lasso Regression method of SignalSelector is shown in Figure 7, where the desired-number-of-signals option was used to select 10 of the 16 signals.
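The λ-for-k search can be sketched as a bisection, exploiting the fact that the number of nonzero Lasso coefficients is (to a good approximation) non-increasing in λ; this builds on the hypothetical lasso_select above and is not the SignalSelector implementation:

def lasso_lambda_for_k(X, y, k: int, lam_hi: float = 1.0, tol: float = 1e-12):
    # grow the upper bracket until at most k signals survive the penalty
    while len(lasso_select(X, y, lam_hi)[0]) > k:
        lam_hi *= 2.0
    lam_lo = 0.0  # lower bracket: a vanishing penalty keeps all signals
    while lam_hi - lam_lo > tol:
        lam_mid = 0.5 * (lam_lo + lam_hi)
        if len(lasso_select(X, y, lam_mid)[0]) > k:
            lam_lo = lam_mid  # too many signals survive: raise the penalty
        else:
            lam_hi = lam_mid  # k or fewer: a smaller penalty may still suffice
    return lam_hi  # smallest bracketed lambda selecting at most k signals

Near breakpoints of the Lasso path the nonzero count can skip a value, so a production implementation would also handle the case where no λ yields exactly k signals.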


---------------------------------------------------------------
Monte Carlo Stepwise Regression selection: 2020-12-31, universe
---------------------------------------------------------------

Parameters:
-----------
Time window: 47

Considered signals list (15):


['Value-- Book Yield', 'Value-- Earnings Yield', 'Value-- Sales Yield', 'Value-- CFO Yield',
'Sentiment-- Stdzd Analyst PT', 'Sentiment-- PT Revisions', 'Sentiment-- Earnings Est. Revisions',
'Sentiment-- Earnings Est Stability', 'Quality-- FCF Mgn', 'Quality-- FCF Mgn Stability',
'Quality-- Piotroski F-Score', 'Technical-- Avg True Range', 'Technical-- Short-Term Reversal',
'Technical-- Velocity', 'Technical-- DownBetaR2']

Protected signals list (0):


[]

Information criterion: BIC

Returns lag: 0 periods

Number of iterations: 50

Results:
--------
Selected signals (3):
['Value-- Book Yield', 'Value-- Earnings Yield', 'Technical-- Short-Term Reversal']

Removed signals (12):


['Value-- Sales Yield', 'Value-- CFO Yield', 'Sentiment-- Stdzd Analyst PT', 'Sentiment-- PT Revisions',
'Sentiment-- Earnings Est. Revisions', 'Sentiment-- Earnings Est Stability', 'Quality-- FCF Mgn',
'Quality-- FCF Mgn Stability', 'Quality-- Piotroski F-Score', 'Technical-- Avg True Range',
'Technical-- Velocity', 'Technical-- DownBetaR2']

BIC: -44930.668554305994

R-squared: 0.0021769390469288386

Table with data of signal selection frequencies:


SignalSelector.mc_stepwise_regression_results['signal_frequencies']['total_period']['universe']

Statistics of the found solutions


---------------------------------

Figure 6: A report from the Monte Carlo Stepwise Regression selection method of the SignalSelector module. A breakdown of the local solutions found can be seen.


------------------------------------------------
Lasso Regression selection: 2020-12-31, universe
------------------------------------------------

Parameters:
-----------
Time window: 47

Considered signals list (16):


['Value-- Book Yield', 'Value-- Earnings Yield', 'Value-- Sales Yield', 'Value-- CFO Yield',
'Sentiment-- Stdzd Analyst PT', 'Sentiment-- PT Revisions', 'Sentiment-- Earnings Est. Revisions',
'Sentiment-- Earnings Est Stability', 'Quality-- FCF Mgn', 'Quality-- FCF Mgn Stability',
'Quality-- Interest Coverage', 'Quality-- Piotroski F-Score', 'Technical-- Avg True Range',
'Technical-- Short-Term Reversal', 'Technical-- Velocity', 'Technical-- DownBetaR2']

Desired number of signals, k: 10

Results:
--------
Selected signals (10):
['Value-- Book Yield', 'Value-- Earnings Yield', 'Value-- CFO Yield', 'Sentiment-- Stdzd Analyst PT',
'Sentiment-- Earnings Est Stability', 'Quality-- FCF Mgn', 'Quality-- FCF Mgn Stability',
'Technical-- Avg True Range', 'Technical-- Short-Term Reversal', 'Technical-- DownBetaR2']

Removed signals (6):


['Value-- Sales Yield', 'Sentiment-- PT Revisions', 'Sentiment-- Earnings Est. Revisions',
'Quality-- Interest Coverage', 'Quality-- Piotroski F-Score', 'Technical-- Velocity']

Lambda: 0.0005232991146814947

Figure 7: A report from the Lasso Regression selection method of the SignalSelector module. It was run with a desired number of signals specified (k = 10); the lambda (λ) that achieves this can be seen in the results.

4. Conclusion
We have presented the diverse array of signal suitability selection tools in the SignalSelector module of
FPE. These tools are designed to help sort through the ever-growing zoo of signals and pick the ones that
can be combined into a desirable alpha signal. We have provided several complementary techniques for
detecting and reducing multicollinearity. These synergize with the regression-based methods for selecting
signals with the best return-predicting power, as they provide the most reliable statistical inferences when
applied to linearly independent sets of signals.
The SignalSelector methods can be conveniently applied to the entire universe (panel regression across time and assets), as well as to user-specified groups of stocks (e.g., by sector) and/or sub-periods (e.g., for out-of-sample validation or studying signal decay). All of these tools are deeply integrated with the Backtest module, the backbone of the FPE equity workflow, ensuring a seamless transition into the subsequent stages – calculating optimal signal weights (OptimalWeightsEngine) and backtesting.


Bibliography

[1] K.P. Burnham and D.R. Anderson. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. Springer New York, 2003. ISBN 9780387953649. URL https://books.google.bg/books?id=BQYR6js0CC8C.

[2] Sadanori Konishi and Genshiro Kitagawa. Information Criteria and Statistical Modeling. Springer Publishing Company, Incorporated, 1st edition, 2007. ISBN 0387718869.

[3] I. Pardoe. Applied Regression Modeling. John Wiley & Sons, Ltd, 2020. ISBN 9781119615941. doi: 10.1002/9781119615941.ch5. URL https://onlinelibrary.wiley.com/doi/abs/10.1002/9781119615941.ch5.


Appendix

Here, we present empirical statistics showcasing the reliability and scalability of the Monte Carlo Stepwise
Regression selection method we have implemented. The setup we used was the following:
• Russell 1000 asset universe
• Two-year time window (2019-2021), monthly frequency
• 62 potential signals (linearly independent to VIF < 5):
['Tax Burden', 'Tangible Book to Price', 'Standardized Analyst Price Target', 'Sales to Price',
'Sales Estimate Stability', 'Sales Estimate Revisions (75D)', 'Return on Invested Capital Change',
'Return on Invested Capital', 'Return on Assets Change', 'Retention Ratio', 'Price Target Estimate Stability',
'Price Target Estimate Revisions (75D)', 'Piotroski F Score', 'Operating Margin', 'Operating Cash Flow Yield',
'Operating Cash Flow Stability', 'Operating Cash Flow Margin Stability', 'Operating Cash Flow Margin Change',
'Operating Cash Flow Growth - 1 Year', 'Net Margin Stability', 'Market Share Industry Group Growth Rate',
'Liability Coverage Ratio Change', 'Liability Coverage Ratio', 'Interest Coverage Ratio Change',
'Interest Coverage Ratio', 'Interest Burden', 'Intangible Assets to Sales',
'Free Cash Flow to Enterprise Value', 'Free Cash Flow Margin Stability', 'Free Cash Flow Margin Change',
'Free Cash Flow Margin', 'Equity Turnover Change', 'Equity Turnover', 'Equity Issuance Growth',
'Equity Buyback Ratio', 'Earnings Estimate Stability','Earnings Estimate Revisions (75D)', 'EPS Stability',
'EPS Growth Rate', 'EBITDA Stability', 'EBITDA Margin Stability', 'EBITDA Margin Change', 'EBITDA Margin',
'EBIT to Enterprise Value', 'EBIT Margin Change', 'EBIT Estimate Stability','EBIT Estimate Revisions (75D)',
'Debt Service Ratio', 'Debt Issuance Growth', 'Change in Intangible Assets to Sales',
'Cash Generating Power Ratio', 'Cash Earnings Ratio', 'Cash Coverage Ratio Change', 'Cash Coverage Ratio',
'CAPEX to Depreciation', 'CAPEX Growth', 'Beneish M Score', 'Asset Turnover Change', 'Asset Turnover',
'Asset Growth', 'Accruals Ratio - Cash Flow Method','Accruals Ratio - Balance Sheet Method']

We applied the Monte Carlo Stepwise Regression (MCSwR) method to find the combination of signals that best predicts stock returns across this universe and time window. This was repeated with each of the three available information criteria (AIC, AICc, BIC) as the optimization target. A similar analysis was performed on random samples of the whole set of 62 signals to demonstrate the scalability of the methods. For sufficiently small sample signal pools (n ≤ 20), an exhaustive calculation of all information criteria for all possible combinations was performed, using our optimized framework (the same as for SwR and MCSwR); this exhaustive calculation was used to validate that the best solution is indeed found by the MCSwR method. All these calculations were run on the same machine, with an Intel Core i7-12800 CPU (2400 MHz, 14 cores, 20 logical processors).
Table 1 shows these results, with a few notable features:
• Brute-force (exhaustive subset search) time grows exponentially with the number of signals – it did not finish (DNF) for 25 signals, though by extrapolation it would have taken ∼5 days.
• MCSwR is considerably more tractable with respect to the number of signals and can feasibly work
with numbers in the low hundreds.
• For all brute-force-feasible cases (n ≤ 20), MCSwR verifiably finds the global solution, and much faster.
• The best solution, found throughout the iterations, is always the one that occurs most often – this is
expected behavior and has been observed in all of our experiments, but is not strictly guaranteed.
• From the two observations above, we can infer strong confidence that MCSwR also finds the global solutions in cases with a larger number of available signals (n ≳ 20), even if brute-force validation is infeasible (taking from weeks at ∼30 signals to the age of the universe at ∼64).
• For the full case (62 signals), both the 50 and 500 iteration runs find the same best solution; the latter
offers stronger statistical confidence that this is indeed the global solution.
• Even a simple Stepwise Regression (equivalent to a single iteration of MCSwR) has a high probability of finding the global solution, though more iterations are advisable for statistical confidence. See the last column, which shows the percentage of random-starting-point iterations (single stepwise regressions) that land on the global solution.

Table 1: Empirical statistics for the Monte Carlo Stepwise Regression selection method. For small signal selection pools (n ≤ 20), a brute-force method was also used, validating that the global solution was found.

Method Used        | Calc. Time, t | MC Iterations, N | Global Solution Confirmed | Local Solutions Found | Signals in Best Solution | Best Solution Occurrence Frequency
-------------------+---------------+------------------+---------------------------+-----------------------+--------------------------+-----------------------------------
Selecting from the full set of 62 signals
MCSwR - AIC        | 164 sec       | 50               | -                         | 19                    | 15                       | 32%
MCSwR - AICc       | 158 sec       | 50               | -                         | 19                    | 15                       | 28%
MCSwR - BIC        | 145 sec       | 50               | -                         | 3                     | 3                        | 90%
MCSwR - AIC        | 26 min        | 500              | -                         | 33                    | 15                       | 30.6%
MCSwR - AICc       | 27 min        | 500              | -                         | 39                    | 15                       | 37%
MCSwR - BIC        | 26 min        | 500              | -                         | 3                     | 3                        | 89%
Selecting from a random subset - 35 signals
MCSwR - AIC        | 46 sec        | 50               | -                         | 5                     | 7                        | 64%
MCSwR - AICc       | 43 sec        | 50               | -                         | 6                     | 7                        | 44%
MCSwR - BIC        | 41 sec        | 50               | -                         | 2                     | 2                        | 78%
Selecting from a random subset - 25 signals
Brute Force (DNF)  | ∼5 days       | -                | -                         | -                     | -                        | -
MCSwR - AIC        | 23 sec        | 50               | -                         | 3                     | 8                        | 40%
MCSwR - AICc       | 23 sec        | 50               | -                         | 3                     | 8                        | 40%
MCSwR - BIC        | 22 sec        | 50               | -                         | 1                     | 2                        | 100%
Selecting from a random subset - 20 signals
Brute Force        | 5.5 hrs       | -                | -                         | -                     | -                        | -
MCSwR - AIC        | 15 sec        | 50               | yes                       | 3                     | 7                        | 78%
MCSwR - AICc       | 15 sec        | 50               | yes                       | 3                     | 7                        | 78%
MCSwR - BIC        | 15 sec        | 50               | yes                       | 1                     | 2                        | 100%
Selecting from a random subset - 17 signals
Brute Force        | 9.7 min       | -                | -                         | -                     | -                        | -
MCSwR - AIC        | 12 sec        | 50               | yes                       | 3                     | 5                        | 82%
MCSwR - AICc       | 12 sec        | 50               | yes                       | 3                     | 5                        | 80%
MCSwR - BIC        | 12 sec        | 50               | yes                       | 2                     | 2                        | 54%
Selecting from a random subset - 15 signals
Brute Force        | 73 sec        | -                | -                         | -                     | -                        | -
MCSwR - AIC        | 10 sec        | 50               | yes                       | 2                     | 5                        | 50%
MCSwR - AICc       | 10 sec        | 50               | yes                       | 2                     | 5                        | 54%
MCSwR - BIC        | 10 sec        | 50               | yes                       | 3                     | 1                        | 56%

