4 - Addressing Data Mismatch
• Perform manual error analysis to understand how the errors differ between the training set and the
development/test sets. Never do development work on the test set, to avoid overfitting to it.
• Make the training data more similar to the development and test sets, or collect more data that
resembles them. One way to make the training data more similar to your development set is
artificial data synthesis. However, you might accidentally be simulating data from only a tiny
subset of the space of all possible examples.
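
The synthesis caveat can be illustrated with a minimal sketch. It assumes a toy setting not taken from these notes: "clean" signals are short lists of floats, and synthetic examples are made by adding a scaled slice of a background-noise clip. The helper name `synthesize`, the `noise_scale` parameter, and all data below are hypothetical. The function also reports which noise clips were actually used, which gives a rough check on how much of the available noise the synthetic set covers.

```python
import random


def synthesize(clean_clips, noise_clips, noise_scale=0.5, seed=0):
    """Mix each clean clip with a randomly chosen noise clip.

    Hypothetical helper for illustration only. Assumes every noise clip
    is at least as long as every clean clip.
    Returns the synthetic clips and the set of noise-clip indices used.
    """
    rng = random.Random(seed)
    synthetic = []
    used_noise = set()
    for clean in clean_clips:
        idx = rng.randrange(len(noise_clips))
        noise = noise_clips[idx]
        used_noise.add(idx)
        # Random offset, so a given noise clip is not always aligned the same way.
        start = rng.randrange(len(noise) - len(clean) + 1)
        mixed = [c + noise_scale * noise[start + i] for i, c in enumerate(clean)]
        synthetic.append(mixed)
    return synthetic, used_noise


# Toy data (assumed): 100 identical clean clips and 20 random noise clips.
clean_clips = [[0.0, 1.0, 0.0, -1.0] for _ in range(100)]
noise_rng = random.Random(1)
noise_clips = [[noise_rng.uniform(-1, 1) for _ in range(16)] for _ in range(20)]

synthetic, used_noise = synthesize(clean_clips, noise_clips)
coverage = len(used_noise) / len(noise_clips)
print(f"{len(synthetic)} synthetic clips, noise coverage {coverage:.0%}")

# With a single noise clip, every synthetic example shares the same background:
# this is the "tiny subset" risk described above.
_, used_single = synthesize(clean_clips, noise_clips[:1])
print(f"distinct noise clips used: {len(used_single)}")
```

Even when coverage of the available noise clips is high, the clips themselves may still represent only a small part of the noise encountered in practice, so a check like this is a necessary signal rather than a sufficient one.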