Week 7
Answer:
A) By using a single-node system to process all the data:
● Incorrect: Using a single-node system is not suitable for big data scenarios
because it does not scale well with large datasets. MapReduce is specifically
designed to work with distributed systems to handle large-scale data processing.
B) By distributing the data and computations across multiple nodes for parallel
processing:
● Correct: In a big data scenario using MapReduce, building a decision tree model
involves distributing the data and computations across multiple nodes to leverage
parallel processing. This approach handles large datasets efficiently by dividing
the work among the nodes of a cluster: each node processes a portion of the data,
and the partial results are aggregated to construct the final decision tree model.
A minimal sketch of this map-and-aggregate pattern follows this list.
C) By manually sorting the data before processing:
● Incorrect: Manual sorting of data is not a typical requirement for decision tree
algorithms in a MapReduce framework. MapReduce handles data partitioning
and sorting automatically during the Map and Reduce phases.
D) By using in-memory processing on a single machine:
● Incorrect: In-memory processing on a single machine is not practical for big data
scenarios due to memory limitations and scalability issues. MapReduce
processes data distributed across multiple nodes, which is essential for handling
large datasets effectively.
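To make the map-and-aggregate pattern concrete, here is a minimal pure-Python
sketch that simulates it for a classification split (the data, the candidate split
threshold, and all names are illustrative assumptions, not part of any specific
MapReduce library):

    from collections import Counter
    from functools import reduce

    # Toy dataset already distributed across "nodes"; each row is (feature, label).
    node_data = [
        [(1.2, "yes"), (0.7, "no"), (3.1, "yes")],
        [(2.4, "no"), (0.3, "no"), (2.9, "yes")],
    ]

    def map_split_counts(rows, threshold=1.5):
        # Map phase: each node counts class labels on either side of a
        # candidate split, using only its local portion of the data.
        counts = Counter()
        for x, label in rows:
            side = "left" if x <= threshold else "right"
            counts[(side, label)] += 1
        return counts

    # Reduce phase: aggregate the per-node counts into global split statistics,
    # from which the impurity (e.g., Gini) of the candidate split is computed.
    partial = [map_split_counts(rows) for rows in node_data]
    total = reduce(lambda a, b: a + b, partial)
    print(total)

In a real cluster, the map step runs on every node in parallel and only the small
count summaries travel over the network, which is what makes this approach scale.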
4. In Apache Spark, what is the primary purpose of using cross-validation in
machine learning pipelines?
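For context, cross-validation in Spark ML is typically used to select
hyperparameters by estimating out-of-sample performance. A minimal PySpark
sketch, assuming a DataFrame df with "features" and "label" columns (the column
names and parameter grid are illustrative):

    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.evaluation import BinaryClassificationEvaluator
    from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

    lr = LogisticRegression(featuresCol="features", labelCol="label")
    pipeline = Pipeline(stages=[lr])

    # Candidate hyperparameter values to compare via cross-validation.
    grid = (ParamGridBuilder()
            .addGrid(lr.regParam, [0.01, 0.1])
            .build())

    # CrossValidator splits the data into k folds, trains on k-1 folds,
    # evaluates on the held-out fold, and keeps the best-scoring model.
    cv = CrossValidator(estimator=pipeline,
                        estimatorParamMaps=grid,
                        evaluator=BinaryClassificationEvaluator(),
                        numFolds=3)
    model = cv.fit(df)   # df is assumed to exist; it is not defined here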
Answer:
A) Both techniques use large step sizes to speed up model updates:
● Incorrect: Gradient boosting and gradient descent do not necessarily use large
step sizes. In fact, gradient boosting typically uses a small learning rate (step
size) so that the model converges slowly and avoids overfitting. Gradient
descent can also use various step sizes, which are tuned to balance
convergence speed and stability.
B) Both techniques use gradients to iteratively improve a model:
● Correct: Gradient boosting and gradient descent both use the concept of gradients
to iteratively improve a model. In gradient boosting, new trees are added to the
ensemble to correct the residual errors of the existing model: each new tree is fit
to the negative gradient of the loss function, which is akin to taking a step in the
direction of steepest descent to reduce the overall error. Similarly, gradient
descent iteratively adjusts the parameters of a model by moving in the direction
opposite to the gradient of the loss function (i.e., along the steepest descent) to
find its minimum. A short numerical sketch follows this list.
C) Both techniques rely on random sampling to update the model:
● Incorrect: Gradient boosting does not inherently rely on random sampling for
updating the model; it builds trees sequentially to correct residuals. Gradient
descent may involve stochastic or mini-batch sampling in its variants (such as
stochastic gradient descent or mini-batch gradient descent), but this is not a
direct conceptual similarity to gradient boosting.
D) Both techniques use a fixed learning rate to ensure convergence without
overfitting:
● Incorrect: While gradient descent may use a fixed learning rate, gradient
boosting typically uses a smaller learning rate as a regularization strategy to
ensure gradual convergence and reduce the risk of overfitting. The learning rate
is not fixed in the same sense for both methods; it is adjusted based on the
context and specific implementation details.
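As a numerical illustration of the correct option, here is a minimal NumPy sketch
of gradient boosting for squared-error loss, with decision stumps standing in for
full trees (the data, the stump search, and the learning rate are illustrative
assumptions, not any library's implementation):

    import numpy as np

    # Toy 1-D regression data (illustrative).
    rng = np.random.default_rng(0)
    X = rng.uniform(0, 10, 200)
    y = np.sin(X) + rng.normal(0, 0.1, 200)

    def fit_stump(X, residuals):
        # Find the threshold split minimizing squared error of a one-split tree.
        best = None
        for t in np.quantile(X, np.linspace(0.05, 0.95, 19)):
            left, right = residuals[X <= t], residuals[X > t]
            if len(left) == 0 or len(right) == 0:
                continue
            sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, t, left.mean(), right.mean())
        _, t, lv, rv = best
        return lambda x: np.where(x <= t, lv, rv)

    # Gradient boosting: each stump fits the negative gradient of the
    # squared-error loss (the residuals), and is added with a small
    # learning rate (shrinkage) to avoid overfitting.
    learning_rate = 0.1
    pred = np.zeros_like(y)
    for _ in range(100):
        residuals = y - pred           # negative gradient for squared error
        stump = fit_stump(X, residuals)
        pred += learning_rate * stump(X)

    print("training MSE:", ((y - pred) ** 2).mean())

The residuals here are exactly the negative gradient of the squared-error loss with
respect to the current predictions, and the small learning rate plays the
shrinkage/regularization role described above.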
Answer:
C) Decision trees cannot handle very large datasets:
● Incorrect: Decision trees can handle large datasets, although the computational
resources and time required grow with the size of the dataset. While they can be
computationally intensive for very large datasets, they are scalable, and there
are techniques and implementations designed specifically for large-scale data.
D) Decision trees require a fixed set of features and cannot adapt to new feature
interactions during training:
● Incorrect: Decision trees do not require a fixed set of features. They can adapt
to new feature interactions dynamically as they grow. The structure of a decision
tree is built by evaluating all available features to find the best splits at each
node.
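To illustrate this point, here is a minimal sketch of the greedy split search inside
a regression tree: every available feature is evaluated at every candidate
threshold, so new feature interactions emerge naturally as the tree grows (the data
and names are illustrative assumptions):

    import numpy as np

    def best_split(X, y):
        # Evaluate every feature and every candidate threshold, returning the
        # (feature, threshold) pair with the lowest total squared error.
        best = (np.inf, None, None)
        for j in range(X.shape[1]):              # all available features
            for t in np.unique(X[:, j])[:-1]:    # candidate thresholds
                mask = X[:, j] <= t
                left, right = y[mask], y[~mask]
                sse = ((left - left.mean()) ** 2).sum() \
                    + ((right - right.mean()) ** 2).sum()
                if sse < best[0]:
                    best = (sse, j, t)
        return best[1], best[2]

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))
    y = np.where(X[:, 0] * X[:, 1] > 0, 1.0, -1.0)   # an interaction effect
    print(best_split(X, y))

Applied recursively to the resulting subsets, this search lets the tree capture
interactions (like the one above) without the features being fixed in advance.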
Answer:
Social Media Analytics (A): Incorrect: Social media analysis is a significant driver
of graph technology adoption, but it is only one specific application area that
benefits from graph engines, not the primary force behind their development.
Machine Learning (B): Incorrect: Machine learning algorithms can benefit from graph
data and graph computations, but the development of graph engines is not solely driven
by advancements in machine learning.
Answer:
● a) Partially correct - Bagging does reduce variance by averaging predictions, but
it is a more general technique that can be applied to various models, not just
decision trees.
● b) Incorrect - Bagging primarily aims to reduce variance rather than bias. While
bagging can indirectly help reduce bias in some cases, its main benefit is in
reducing the model's variance.
● c) Correct - Bagging, short for Bootstrap Aggregation, is a general method for
averaging the predictions of various algorithms, not limited to decision trees, and
it works by reducing the variance of the predictions. A minimal sketch follows.
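As a sketch of that idea, the following NumPy example bags a deliberately
high-variance base learner (1-nearest-neighbor regression) by training it on
bootstrap resamples and averaging the predictions (all data and names are
illustrative assumptions):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.uniform(0, 10, (200, 1))
    y = np.sin(X[:, 0]) + rng.normal(0, 0.3, 200)

    def fit_1nn(X_tr, y_tr):
        # 1-nearest-neighbor regression: a low-bias, high-variance base learner.
        def predict(X_new):
            d = np.abs(X_new[:, None, 0] - X_tr[None, :, 0])
            return y_tr[d.argmin(axis=1)]
        return predict

    # Bagging: fit the same learner on B bootstrap resamples, then average.
    B = 50
    models = []
    for _ in range(B):
        idx = rng.integers(0, len(X), len(X))    # bootstrap resample
        models.append(fit_1nn(X[idx], y[idx]))

    X_test = np.linspace(0, 10, 5)[:, None]
    bagged = np.mean([m(X_test) for m in models], axis=0)
    print(bagged)    # averaged predictions: smoother, lower variance

Averaging many bootstrap-trained copies cancels much of the base learner's
variance while leaving its bias essentially unchanged, which is why c) is correct.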
Answer:
D) Incorrect - They eliminate the need for data preprocessing: Decision trees still
require some data preprocessing, such as handling missing values and encoding
categorical features, although unlike many other models they are largely
insensitive to feature scaling.
Which technique is used to manage the data that must be split across different
nodes in a MapReduce implementation of a regression decision tree?
A) Feature scaling
B) Data shuffling
C) Data partitioning
D) Model pruning
Answer:
C) Correct - Data partitioning is the technique used to manage the data that must
be split across different nodes in a MapReduce implementation of a regression
decision tree: each partition is processed locally during the map phase, and the
partial results are combined during the reduce phase. A minimal sketch follows.
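Here is a minimal pure-Python sketch of how data partitioning supports this,
simulating the map and reduce phases for a regression tree (the partitions,
candidate splits, and all names are illustrative assumptions):

    from collections import defaultdict

    # Toy data split into partitions, as a MapReduce framework would distribute it.
    partitions = [
        [(1.0, 2.1), (2.0, 2.9), (3.0, 4.2)],   # (feature, target) pairs
        [(4.0, 5.1), (5.0, 5.8), (6.0, 7.2)],
    ]
    candidate_splits = [2.5, 4.5]

    def map_phase(partition):
        # Emit (split, side) -> [count, sum, sum_sq] sufficient statistics,
        # enough to score each candidate split by variance reduction.
        stats = defaultdict(lambda: [0, 0.0, 0.0])
        for x, y in partition:
            for s in candidate_splits:
                side = "L" if x <= s else "R"
                c = stats[(s, side)]
                c[0] += 1; c[1] += y; c[2] += y * y
        return stats

    def reduce_phase(all_stats):
        # Aggregate the partial statistics from every partition.
        total = defaultdict(lambda: [0, 0.0, 0.0])
        for stats in all_stats:
            for key, (n, sy, syy) in stats.items():
                t = total[key]
                t[0] += n; t[1] += sy; t[2] += syy
        return total

    total = reduce_phase(map_phase(p) for p in partitions)
    for (s, side), (n, sy, syy) in sorted(total.items()):
        sse = syy - sy * sy / n   # within-node sum of squared errors
        print(f"split {s} side {side}: n={n}, SSE={sse:.3f}")

Because only these small sufficient statistics cross the network, the reduce phase
can pick the variance-minimizing split without ever gathering the raw partitioned
data on one machine.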