
Machine Learning From Stack Overflow

The document discusses migrating older, off-topic machine learning theory questions from Stack Overflow to Cross Validated. It proposes collaborating with Cross Validated moderators to close and migrate eligible questions over 60 days old. This could preserve valuable content while adhering to each site's scope. Concerns about reputation loss are raised, as well as ensuring target sites like Artificial Intelligence are also consulted given potential overlap. Cross Validated moderators have expressed openness, and the initiative now requires action to get started.

Uploaded by

FatihRahman
Copyright: © All Rights Reserved

4/21/23, 11:23 AM Let's gift wrap our (good) machine learning theory questions for Cross Validated - Meta Stack Overflow

Let's gift wrap our (good) machine learning theory questions for Cross
Validated
Asked 2 years, 2 months ago Modified 1 year, 2 months ago Viewed 2k times

Machine learning (ML) theory questions are off-topic on Stack Overflow. There is no question
about this.

Stack Overflow also has many, many, many excellent Q&As on ML theory. They often have
answers that are updated regularly (as machine learning is a developing science) and are an
excellent source of information for beginners in ML.

They are also, as previously stated, completely off-topic for modern Stack Overflow, and should
be closed under our moderation rules.

Being closed means they can't be updated anymore. Which is, in my opinion, actively harmful to
the goal of being a repository for knowledge. And there's a lot of knowledge there.

Those questions are, however, directly in the wheelhouse of Cross Validated.

Normally, a good, off-topic question should be migrated if there's an on-topic Stack Exchange
site. But questions older than 60 days are not allowed to be migrated, mostly because allowing
one site to push questions to another caused a lot of bad questions to be shuffled around for no
good reason. Not even site moderators can get around this, although SE Community Managers
(CMs) can.

I do sincerely feel that saving these questions would be a good reason.

I've posted on Cross Validated meta at the suggestion of some Stack Overflow moderators,
and the Cross Validated community seems to agree.

I want to raise this here so that people can:

1. Nominate, collate, and assist in closing good, but off-topic ML theory questions to hand off
to Cross Validated, in cooperation with Stack Exchange CMs.

2. Allow an open venue for discussion of whether such questions should be migrated at all, in
case anyone has some issue with the concept.

discussion closed-questions migration

https://meta.stackoverflow.com/questions/404799/lets-gift-wrap-our-good-machine-learning-theory-questions-for-cross-validated?noredirect=1#comm… 1/14

edited Jan 30, 2021 at 9:14 by iBug (34.9k reputation); asked Jan 28, 2021 at 11:33 by Daniel F (13.5k reputation)

4 Cross Validated or Artificial Intelligence, though? There's a lot of overlap between those two sites, we might
want to ask their metas what they think on this issue, but ML theory is one of the main topics on Artificial
Intelligence as well afaik – Erik A Jan 28, 2021 at 12:32

5 @ErikA my impression of the AI/ML split is AI is the "why" and ML is the "how" - and the top questions on
ai.se sort of bear that out, especially when compared to Cross Validated – Daniel F Jan 28, 2021 at 12:47

1 @DanielF That's very much a function of the "why" questions being disallowed on Cross Validated, not the
"how" questions being unpopular on AI. Also note that AI is a fairly recent site, so it can't address most of
the basic "how" questions since they've already been addressed elsewhere. My main point is that before we
start moving stuff over to either site, it might be a good idea to involve the community on the target site,
because we can easily see it doesn't fit in here, but they should judge if it fits in at their site and if there are
potential side-effects (e.g. dupes existing). – Erik A Jan 28, 2021 at 12:52

17 Machine learning Qs are on topic on Cross Validated, they have been for just about forever. It's an older,
larger, & more active site. Questions about the theory & philosophy of AI are off topic & are a better fit for
AI, but the examples highlighted in this question (& most of the ML Qs I've seen on SO) are a better fit for
CV. SE is full of overlapping sites, & there may be a couple Qs that would do better at Theoretical
Computer Science, etc, but the overwhelming majority will do best on CV. For the sake of making this
workable, we should do the simplest thing 1st & make it more complicated later.
– gung - Reinstate Monica Jan 28, 2021 at 12:52

15 @ErikA, the community on the proposed target site has been involved & agrees with this proposal.
– gung - Reinstate Monica Jan 28, 2021 at 12:54

What about asking the OPs of the questions and answers about this? Some might not be happy if a large
portion of their reputation and privileges are migrated away. – user000001 Jan 28, 2021 at 13:57

2 @user000001 That's a reasonable counterargument (and I'd imagine is the source of most pushback).
– Daniel F Jan 28, 2021 at 14:19

5 @user000001 Although it appears that questions migrated after 60 days don't lose rep (because rep is
frozen after 60 days and you can't lose it anymore) - one of the reasons it's normally not allowed, as it
means basically duping rep. – Daniel F Jan 28, 2021 at 14:47

4 About possible migration target ai.se: there seems to be a policy of only reluctantly migrating to sites in
public beta. The machine-learning tag on CV has 16K Q and about 1200 followers, on ai.se 1.7k Q and
about 220 followers. The all-time answering rate is higher on CV, but the difference is small.
– kjetil b halvorsen Jan 28, 2021 at 18:53

3 Since ai.se was brought up, I feel datascience.se should also be mentioned. All the questions listed so far
are a better fit for CV than DS in my opinion (that of course is subjective given the overlap), but perhaps
you will find some others that would fit better at DS. – Ben Reiniger Jan 28, 2021 at 20:07

5 @BenReiniger it isn't we that are sending them, they are asking for them to be given to them. If datascience
wants them, they can ask for them. This is a fire sale, 2 non-programming questions free for getting 1.
– Braiam Jan 28, 2021 at 22:50


2 "Machine learning (ML) theory questions are off-topic on Stack Overflow. There is no question about this."
Yet the answer you link doesn't support this statement. – Cris Luengo Jan 31, 2021 at 5:47

1 Any updates on this, about 1 year later? I, a moderator at Cross Validated, am interested to get the ball
rolling on this. What needs to happen to get this initiative underway? – Sycorax Feb 3, 2022 at 0:18

This is important for the DS SE too. A moderator election will be launched in June on the DS SE. I feel like
clarifying the perimeters of the different exchanges will be the main driver for community and moderator
involvement in the DS SE. – lcrmorin May 25, 2022 at 9:54

Another remark is that some of the main ML packages (at least tensorflow and lightgbm) now route to SO
by default - on top of answering questions on their github. It might not be entirely clear for the users
where to post their questions. Might be worth taking that into account in any decision, and maybe getting
in touch with them. – lcrmorin May 25, 2022 at 9:58

4 Answers

Here is a list of deleted questions having the machine-learning tag, with scores >= 3:

Overwhelmed by Machine Learning---is there an ML101 book?

What are good examples of solutions to neural network problems?

Machine Learning in Game AI

Is there a recommended package for machine learning in Python?


Do People Actually Use Machine Learning?

Anyone Recommend a Good Tutorial on Conditional Random Fields

Best books, blogs, link, reading about AI and machine learning

What are some popular OCR algorithms?


Neural networks - obsolete?

Machine Learning Library?

Fastest general machine learning library?

Where can I find a kdtree implementation?

Netflix prize dataset?

Mathematics for AI/Machine learning?

The business of Artificial Intelligence

Good source for machine learning datasets in computer vision

Justin Bieber detector

Scikit-learn equivalent for C++?

What training data sources could be used for sentiment classification models?

How exactly is xgboost model boosting from the initialized predictions?

Techniques for building recommendation engines?

What are known uses of AI in web development?

How to increase low validation accuracy in Keras?

How to (systematically) tune learning rate having Gradient Descent as the Optimizer?

what's a good language for learning machine learning?

What are some practical applications for a single layer perceptron?

Doing a research about pragmatic chaos

How can I demonstrate machine learning skills?

Converting Keras model to Tensorflow implementation. Different results

Automated Legal Processing

Data has two trends. How to extract independent trendlines

Difficulty in applying particle swarm optimization in training : Weight matrix

voice recognition programming through java sphinx4

How to tweak XGBoost to give more weight to particular Predictor column


Python alternatives to Java LensKit

TESPAR Coding Method - how to generate the alphabet?

Pytorch: Visualizing Images that Maximally Activate a Neuron

How does Decision Tree with Gini Impurity Calculate Root Node?

BiDirectional Lstm: Accuracy not going above 55%

Statistical Machine Learning Books

Compute FPR with pybrain is it possible, if yes the how?

Is there a python implementation for the SMOTE algorithm?


Gibbs sampling for a simple linear model -- need help with the likelihood function

DataSet for News Recommendation System

application of AI/neural networks/machine learning in stock market trading: looking for a
book(s)

Clustering algorithm where a document can be in more than one cluster

training a hidden markov model with unknown states?

pattern recognition entrance exam questions?

Multi-label tweet classification python nltk

PyTorch Network Training. Tensorflow Network is Not. Cant Spot Difference


Books on Machine Learning

Multi label Classification in spark

Implementing Convolutional Neural Network - Problems


Algorithms and methods for attribute/feature selection?

Statistics vs Machine learning and which Java API to use

How to Extract Titles from text file using python

Ontological databases for professions and skills

Weights after training neural network are all negative

Anomaly Detection: What algorithm to use?

Suggestion needed to learn Machine Learning and Information Retrieval

Matlab : Stuck in how to apply theoretcial concept in learning by Self Organizing Map
How to provide an upper bound for this simple algorithm

OpenCV 3.1 : Training SVM classifier (SURF+BOF)

Machine learning algorithms and their advantages and disadvantages

How to build a predictive model that uses only a subset of training factors , for testing?

Inverse of positive definite matrix in matlab contains negative elements?

Can Machine learning solve this issue

Where am I going wrong in implementing the log-likelihood expression using Matlab

Text classification with Decision Trees and spark Dataframes

How to determine the language of the text?

Decision trees vs. Neural Networks

preprocessing audio in tensorflow

TypeError: Cannot convert a symbolic Keras input/output to a numpy array when using
custom loss taking model internals Tensorflow 2

Matlab : Confusion regarding application of k-nearest neighbor search in information
retrieval

Implementing LSTM backpropagation from scratch using Numpy

Cannot Find Tensor in Metagraph in Tensorflow to Freeze Graph

Error (OOM) when using custom loss function in Keras


How to differntiatethe give recommendation depending on id

Gaussian mixture models and K-Means

Can a neural network provide more than "yes" or "no" answers?


Use CNN with two inputs for prediction

Error when loading pickled model : AttributeError: Can't get attribute 'Stemmed' on <module
'main' from

What are the differences between Ridge regression using R's glmnet and scikit-learn?

Unable to implement pipeline in image classification problem

Training, validating and testing sets have >90% accuracy but using model.predict always
guesses the same output class

Which data fitting model to use for this problem

How to switch from multiclass to multi label classification?

Keras custom regularization behavior using outputs of intermediate layers

It is easy for moderators to access this list, but less easy for regular users. However, regular users
can still get to the information using SEDE, so it's not as if this post is leaking anything
confidential.

I provide this list because the claim has been floated that deletion of off-topic machine learning
questions on Stack Overflow may be destroying value.

Note that it has not been filtered by who deleted the question, so this will include self-deleted
questions. Nor is it filtered by date of deletion, so several of these were deleted many years ago.

Going below a score of 3 gets into a very long tail, almost certainly with a majority of
uninteresting questions, so I've omitted those.
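
For anyone who wants to reproduce a list like this from their own export of post data, the selection described above (deleted, tagged machine-learning, score >= 3, no filtering by who deleted or when) can be sketched roughly as follows. This is only an illustration, not the script actually used for this answer: the `Post` record, its field names, and the sample rows are hypothetical, and sorting by score is an assumption rather than something stated here.

```python
from dataclasses import dataclass


@dataclass
class Post:
    # Hypothetical record layout; a real data export will use different column names.
    title: str
    tags: tuple
    score: int
    deleted: bool


def deleted_ml_questions(posts, min_score=3, tag="machine-learning"):
    """Return deleted posts carrying `tag` with score >= min_score.

    No filter on who deleted the post or on the deletion date,
    matching the caveats in the answer. Sorting is an added assumption.
    """
    hits = [p for p in posts if p.deleted and tag in p.tags and p.score >= min_score]
    return sorted(hits, key=lambda p: p.score, reverse=True)


# Made-up sample rows, purely for illustration.
sample = [
    Post("Hypothetical deleted ML question", ("machine-learning",), 5, True),
    Post("Hypothetical low-score question", ("machine-learning",), 1, True),
    Post("Hypothetical open question", ("machine-learning", "python"), 40, False),
    Post("Hypothetical deleted non-ML question", ("java",), 9, True),
]

for post in deleted_ml_questions(sample):
    print(post.score, post.title)
```

Lowering `min_score` is what would pull in the long tail mentioned above.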

edited Jan 28, 2021 at 23:22; community wiki, 2 revs, Cody Gray

There seem to be some migrated posts here, too; now I'll be damned if I could figure out how the heck
Decision trees vs. Neural Networks found its way to... Software Engineering (of all places)! :-0 – desertnaut
Jan 29, 2021 at 0:23

4 When that was migrated 8 years ago, Software Engineering was not the place it is today. It was more like a
dumping ground for questions that didn't fit on Stack Overflow. – Cody Gray Mod Jan 29, 2021 at 0:38

1 Good to know! What do you imply by the claim about destroyed value? I sampled ~ 20, and all were crap and
well-deleted. – desertnaut Jan 29, 2021 at 0:49

1 I didn't look at any of them, except the first one, to check that the script I used to extract and format the
links was working. There are no implications here, just information. – Cody Gray Mod Jan 29, 2021 at 0:55

1 If nothing else Neural Networks - obsolete? is a counterargument to the idea that "basic" answers will never
change. – Daniel F Jan 29, 2021 at 5:29

1 It's not "a matter of trade-offs" anymore, @Daniel? – Cody Gray Mod Jan 29, 2021 at 5:31


@CodyGray that's popular by virtue of being a neutral answer to an opinion question (and thus rightfully
closed). But the "right" answer to the SVM/NN decision has flip-flopped at least 3 times in the last 10 years.
– Daniel F Jan 29, 2021 at 5:53

I am afraid that whoever claims that ^^ has clearly no clue on what exactly they are talking about.
– desertnaut Jan 29, 2021 at 8:40

5 Of all the fears you've expressed, misinterpreting my statements seems to be the least of them. – Daniel F
Jan 29, 2021 at 9:40

I'll list the gift basket here:

Confirmed:

What is the role of the bias in neural networks?

What is the difference between supervised learning and unsupervised learning?

Difference between classification and clustering in data mining?

Why must a nonlinear activation function be used in a backpropagation neural network?

What is the difference between a generative and a discriminative algorithm?

A simple explanation of Naive Bayes Classification

How does the Google “Did you mean?” Algorithm work?

Epoch vs Iteration when training neural networks


What is the difference between linear regression and logistic regression?

How to interpret “loss” and “accuracy” for a machine learning model

What is an intuitive explanation of the Expectation Maximization technique?

Why should weights of Neural Networks be initialized to random numbers?

Is there a rule-of-thumb for how to divide a dataset into training and validation sets?

Why does one hot encoding improve machine learning performance?

What is the difference between a feature and a label?

What are advantages of Artificial Neural Networks over Support Vector Machines? - Closed
for Opinion-based, OK for migration as per CV mod @Sycorax, but will be dupe of CV
question

multi-layer perceptron (MLP) architecture: criteria for choosing number of hidden layers and
size of the hidden layer? Closed for Opinion-based, OK for migration as per CV mod
@Sycorax

Disputed (to be confirmed with stats.se mods):

Which machine learning classifier to choose, in general? - Closed for Opinion-based



When should I use genetic algorithms as opposed to neural networks? - closed for Opinion-based

Nearest neighbors in high-dimensional data? - closed for lacks focus

How to understand Locality Sensitive Hashing? - closed for "seeking resources"

Candidates (Not currently closed, but possibly should be):

Why do we have to normalize the input for an artificial neural network?

Intuitive understanding of 1D, 2D, and 3D convolutions in convolutional neural networks

How does Apple find dates, times and addresses in emails?

What is the difference between value iteration and policy iteration?

What is the mAP metric and how is it calculated?

I would only consider proposing closed questions for migration, as I want it unambiguous that
there is a consensus by the SO community that the questions are not wanted or off-topic, lest
there be a precedent for one stack cannibalizing another's (desired, on-topic) questions. Recently
closed should be fine as long as there is apparent consensus.

EDIT: Starting the process of closing other theory-based questions. Candidate questions are not
currently closed, but probably should be. Will (slowly) bring them before SOCVR in addition to
listing them here.

edited Feb 11, 2021 at 8:37; community wiki, 8 revs, Daniel F

2 The middle two are closed for being opinion-based. Would they have been closed on CV for the same
reason? Are they rescued by having objective answers? – Ben Reiniger Jan 28, 2021 at 20:11

Anyone can feel free to edit and add questions they think would be appropriate under Candidates. I'll ask
to get converted to community wiki to ensure anyone can add – Daniel F Feb 10, 2021 at 13:06

This is only the CV basket, correct? Has any other site shown interest? – Braiam Feb 10, 2021 at 13:14

@Braiam my current thought is to give CV first crack and then anything off-topic there offer up to Data
Science or AI. But I don't particularly want to adjudicate the destination here, if anything that can be
adjudicated when it all goes to MSE - they're the ones that have to do the migration at the end anyway, and
it's more on-topic there. – Daniel F Feb 10, 2021 at 13:24

1 I'm a CV mod. We have duplicate questions to stackoverflow.com/questions/11632516/… which are still
open. So, it's on topic on CV, but also it's a dupe of stats.stackexchange.com/questions/366581/… – Sycorax
Feb 11, 2021 at 2:35

1 stackoverflow.com/questions/4674623/… is squarely on-topic on CV. – Sycorax Feb 11, 2021 at 2:37

1 stackoverflow.com/questions/13610074/… is squarely on-topic on CV. – Sycorax Feb 11, 2021 at 2:38


1 stackoverflow.com/questions/42883547/… is squarely on-topic on CV. – Sycorax Feb 11, 2021 at 2:38

1 stackoverflow.com/questions/17469835/… is squarely on-topic on CV. – Sycorax Feb 11, 2021 at 2:38

1 stackoverflow.com/questions/10565868/… is also on-topic on CV. – Sycorax Feb 11, 2021 at 2:39

Thanks for your input @Sycorax. I'd be mostly interested in your opinions on "disputed" questions, as they
are the ones that are closed, but not as off-topic, and thus are in danger of being closed/deleted on CV if
migrated. If you say they would not be considered closable on CV, I'll move them to "confirmed." Also
perhaps the CV mods opinions on historically locking duplicate questions from SO should be re-iterated
here from the response on the MCV question – Daniel F Feb 11, 2021 at 8:41

I agree that stackoverflow.com/questions/12952729/… is a poor question -- it would be closed on CV unless
OP edited it to have a more specific question. stackoverflow.com/questions/5751114/… needs focus -- we
probably have questions that address each part of it individually. – Sycorax Feb 11, 2021 at 15:35

stackoverflow.com/questions/2595176/… and stackoverflow.com/questions/1402370/… would both be
closed as requiring focus. – Sycorax Feb 11, 2021 at 15:39

@DanielF Can you explain what you mean by "Also perhaps the CV mods opinions on historically locking
duplicate questions from SO should be re-iterated here from the response on the MCV question"? I don't
follow. – Sycorax Feb 11, 2021 at 15:40

Apart from Cross Validated (or Stats) Stack Exchange, there are other SE stacks/sites where these
questions would also be on-topic, such as Artificial Intelligence Stack Exchange and Data Science
Stack Exchange.

Now, the question is:

Why would you migrate these questions to Stats SE and not to AI SE or DS SE?

Stats SE is not a public beta, like AI SE, so that may be your explanation. However, the other
mentioned stacks have been around for a long time, and they also have a big enough number of
members that would be interested in these questions and, as far as I understand, sustainable
public betas are not at risk of being shut down. Moreover, here it's stated

All beta sites have the same temporary placeholder design. Once the site is no longer
beta, it will have a unique design built with input from the community.

AI SE no longer has the same "temporary placeholder design" as other public beta sites.
Additionally, from what a CM had once told me, it's definitely not at risk of being shut down,
although, technically, as far as I know, we are still a public beta.

I don't mind if most machine learning-related questions are migrated to Stats SE, but there are
also questions that could be migrated to AI SE (e.g. questions tagged with artificial-intelligence,
such as this one, reinforcement-learning or genetic-algorithm, which do not involve programming
issues/bugs, which would be off-topic on AI SE).

So, the issue you're raising is not just related to machine learning questions that need to be
migrated to Stats SE, but related to valuable questions that are now off-topic on Stack Overflow
but that would be more suitable for other SE sites. In my view, it's a good thing to migrate
questions that are off-topic here to other more suitable sites, so that SE sites start to focus on
specific topics and can build more cohesive communities, where people interested in the same
topics are grouped together.

In any case, the challenge remains: which other SE would you migrate these off-topic questions
to?

(Disclaimer: I am a moderator at AI SE.)

edited Jan 30, 2021 at 22:24; answered Jan 29, 2021 at 18:41 by nbro (15k reputation)

6 Note that DS is not in beta. – Ben Reiniger Jan 29, 2021 at 19:08

@BenReiniger Oh, I've always thought it was in beta because it still has the standard design for betas.
– nbro Jan 29, 2021 at 19:09

1 I would like to note that AI SE has graduated, so it's not anymore a public beta site. – nbro Dec 30, 2021 at
15:54

Disclaimer: I am a frequent contributor to machine-learning; I hold a relevant gold badge and, as it
happens, I am currently ranked at #2 of all-time respondents in the tag. Given that, I guess one
could claim that I am kind of an ML subject matter expert (SME). That said, I have not posted any
answers in the threads you have linked here, so there is not any direct conflict of interest.

I think what you propose is not a good idea, which, moreover, seems to be based on false
premises. Before checking the premises, let me hereby repost the examples list from your sister
CV Meta post, for convenience of the reader:

1. What is the role of the bias in neural networks?

2. What is the difference between supervised learning and unsupervised learning?

3. Which machine learning classifier to choose, in general?

4. What are advantages of Artificial Neural Networks over Support Vector Machines?

5. Difference between classification and clustering in data mining?

6. Why must a nonlinear activation function be used in a backpropagation neural network?


From what you say here and at your sister post at Cross Validated Meta, I understand your
premises as follows:

1. These questions, being closed as off-topic here, they "suffocate" by not being possible to
receive new answers; hence, they are not kept up to date with recent advances and the latest
research.

2. By being closed, they are in high danger of accidental deletion; moreover, there might be
some overzealous SO users out there determined to eventually remove (i.e. actually delete)
such off-topic but valuable posts, hence these questions need protection like some kind of
endangered species.

3. Ourselves, i.e. the SO community at large (including its moderators), are incapable of
providing this necessary protection, so we should better migrate them somewhere else
where they will be better looked after.

Let's take these premises one by one:

These questions, being closed as off-topic here, they "suffocate" by not being possible to
receive new answers; hence, they are not kept up to date with recent advances and the
latest research.

Not true. Even a superficial look at these questions by a trained eye (see disclaimer above)
reveals that they are about very basic topics and definitions (ML 101), which have been textbook
material since the 1990s at least. There are not going to be any updates whatsoever on the
definition of supervised and unsupervised learning, the difference between classification and
clustering, or the role of bias and nonlinear activation functions in neural nets. Not a single one of
these topics is cutting edge, and no post will suffer or lose value by not being able to stay
"updated". Given that most questions have anything between 15 and 36 answers, I think any
"diversity" requirement is already sufficiently provided, too.

By being closed, they are in high danger of accidental deletion; moreover, there might be
some overzealous SO users out there determined to eventually remove (i.e. actually
delete) such off-topic but valuable posts, hence these questions need protection like
some kind of endangered species.

Based on your comments elsewhere, it would seem that this is one of your main motivations; but
how does it stand up against the evidence?

5 out of the 6 questions you have linked here have exactly zero delete votes; there are a couple
of delete votes in one single question (which, interestingly enough, was single-handedly closed by
a mod as too broad, and not as off-topic), but other than that, nothing.


Factoring in that these questions and the answers therein have a very large number of upvotes
(also obviously used as a proxy for their value), it would take no fewer than ten (10) votes from
users with more than 10K rep to actually delete them. This puts the probability of something like
that happening by chance at a negligible value (most of them have survived well for at least a
decade now). I have voted for closing several of them myself as off-topic, but it never crossed my
mind to vote for deletion of such existing valuable stuff.

Now, if you have any evidence of death squads with >= 10 members of >= 10K rep sneakily
moving around and taking aim to exterminate valuable stuff in machine-learning (or any other tag,
for that matter), I would seriously suggest you share it with us here. Until then, it would certainly
seem that this premise of yours is also not true.

Ourselves, i.e. the SO community at large (including its moderators), are incapable of
providing this necessary protection, so we should better migrate them somewhere else
where they will be better looked after.

Now, I will admit that this is perhaps my biggest issue here: the act you propose and its rationale
implicitly but clearly depict us, the SO community at large, as dangerous, irresponsible,
stupid folks, who are so useless that they cannot be trusted to preserve their own valuable (albeit
legacy and currently off-topic) stuff, so they need to hand it over in order to save it. And I like to
believe that this, too, is not true. We can protect such stuff from deletions, either accidental or
malicious ones.

I will argue that such questions are part of our history and our legacy. And if we seriously think
of handing over our legacy, we'd better do it for solid reasons, which I have yet to see.

I will additionally argue that, on top of the general SO rules, there are two general guidelines that
we should keep in mind in such discussions:

1. try to not deliberately remove existing value from the site

2. keep some valuable stuff here for historical reasons, however off-topic it may be today

The buy-in from Cross Validated mods is hardly a surprise: they get our crème de la crème
stuff from 2009 and 2010 for free, while we stay back and still have to handle the piles of
incoming crap (here, not there) on a daily basis.

These questions are part of our legacy and our history. They are remnants of days gone by,
from 2009 and 2010, when SO was the only place where one could ask such questions and
expect a decent answer, and sites like Cross Validated did not even exist yet.

What will be their fate if migrated? And should we care? I argue we should, indeed. What will
happen to "What is the difference between supervised learning and unsupervised learning?", asked
in 2009, with all its glorious upvotes, if migrated to CV? Will it eventually be closed as a duplicate
of their own existing thread on the same subject, asked back in 2010? Or will the opposite
happen? Does either of these outcomes sound fair and satisfactory? I think not. Should we care at
all?

We can and should care about our own history and legacy. We can and should keep it here, and
take care of it. Ourselves. Fondly.

After all, if we don't do it, who will?

Who, really? :(

edited Jan 28, 2021 at 14:00 answered Jan 28, 2021 at 13:54
desertnaut
56.7k 1 18 25

15 I honestly don't get the draw of "let's strangle these questions for being off-topic and then lovingly admire
their corpse" but I guess I'm not personally invested enough. – Daniel F Jan 28, 2021 at 15:14

11 I'm not sure if you're joking or not, because highly upvoted content gets deleted all the time, in fact users
raise flags to beg moderators to preserve value from deletion with a lock on the content. – Scratte Jan 28,
2021 at 15:41

9 "These questions are part of our legacy and our history" our (SO) legacy and history is one of sucking, you
know? Of hard lessons and much drama. We've already done this several times in the past, with non-
programming (later programmers, later software engineering), code reviews, game development, code golf,
etc. – Braiam Jan 28, 2021 at 15:54

6 @Scratte the stuff you link to is very uneven, including posts deleted by OPs themselves and/or by
moderators. That said, I think that locking questions such as the ones we are talking about here is indeed
the ideal solution (no need even to coordinate anything with 3rd parties), instead of packing them and
sending them away wholesale. – desertnaut Jan 28, 2021 at 16:00

15 I really disagree with this post, albeit as someone completely outside the bubble of SMEs on this topic.
Moving knowledge that's off-topic here altogether to a place that would (presumably) take better care of it
than we do, sounds like an excellent way of preserving our knowledge base and legacy, because I think that
if we really care about the knowledge in our base, we'd be eager to preserve it if it doesn't fit here
anymore; this is exactly what this proposal does, by giving it a safer, more fitting home. – zcoop98 Jan 28,
2021 at 16:16

26 Holding on to scraps of off-topic knowledge just because they’re “part of our legacy” when they’d fit better
somewhere else sounds like hoarding to me, not preserving. This also fully misses the fact that
CrossValidated is building a knowledge base of their own, to which this would apparently be a great
addition (according to their mods). Why is our repository more important than theirs? – zcoop98 Jan 28,
2021 at 16:16

12 Maybe I'm just unusually unsentimental, but being "part of our legacy and our history" just ... doesn't seem
like a good reason to keep it at all. You could say the same about the probably millions of garbage posts
we've deleted over the years and no-one's fighting for us to keep all of those around. If you're happy with
those getting deleted, which are equally "part of our legacy", but not these, then the distinction presumably
has something to do with these posts having value outside of being part of a legacy. So really the
argument should be about that intrinsic value. – NotThatGuy Jan 28, 2021 at 19:33

3 @NotThatGuy the whole argument (both in the question and in my answer, we agree on that) presumes
that there is value in these posts, despite being off-topic here and now; if they were garbage, we would just
delete them. – desertnaut Jan 28, 2021 at 19:44

9 @desertnaut But you're arguing that the questions are "part of our legacy", and it's a big deal to damage
that legacy. I disagree. We shouldn't care about that at all. Stack Exchange is a network of Q&A sites, not an
archive, and especially not an archive of its own legacy. A historical lock might sometimes be applied to
questions which are particularly valuable, but this should be the exception rather than the norm. What
should be most important is where the questions will have the most value and best serve the wider world. If
we wanted to archive our legacy, there would be better ways. – NotThatGuy Jan 28, 2021 at 20:11

6 Regarding updates, there probably won't be any, but it's not impossible that someone comes up with some
new technique that makes a definition different for different techniques or just changes it to some degree,
or even just that someone comes up with a great new way to explain it. Regarding deletion, it is fair game
to delete anything that's closed (and not locked), whereas it's not possible for regular users to delete
anything that's open, so that's at least a risk. Having closed questions that are intended to remain relevant and
applicable to the present is really not ideal. – NotThatGuy Jan 28, 2021 at 21:20

@NotThatGuy also, the lock should be self-expiring. It's very unlikely that, over the years, nobody has asked
the same question elsewhere and gotten the same or better answers. – Braiam Jan 28, 2021 at 23:03

1 "Even a superficial look at these questions by a trained eye (see disclaimer above) reveals that they are
about very basic topics and definitions (ML 101), which have been textbook material since the 1990s at
least. There are not...." I disagree with this argument. ML, like most scientific fields, is constantly evolving.
Some recent advances have changed the way we think about foundational topics in ML. Some future
advances could even affect answers to the questions on your list. Also, your list is only a small sampling of
the ML questions that have been closed for being off-topic. – robguinness Feb 11, 2021 at 16:48
