Data Analytic MCQ

Uploaded by

sandippatra567

 Which of the following is one of the largest subclasses of boosting?

gradient boosting(Y)

 Which of the following is the most important language for data science?
o R
 Which of the following data mining techniques is used to uncover patterns in data?
o data dredging(Y)
 The goal of ___________ is to focus on summarizing and explaining a specific set of data
o descriptive statistics(Y)
 Non-overlapping categories or intervals are defined as ______
o mutually exclusive(Y)
 Focusing on describing or explaining data versus going beyond immediate data and making
inferences is the difference between _______
o descriptive and inferential(Y)
 Which of the following standard probability distributions is applicable
to discrete random variables?
o Poisson distribution(Y)
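As a minimal sketch of why the Poisson distribution counts as discrete, the snippet below evaluates its probability mass function on non-negative integers only and checks that the masses sum to about 1. The rate `lam = 3.0` and the cutoff of 50 terms are assumptions chosen for illustration.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson random variable with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 3.0  # hypothetical rate, not taken from the question
# The distribution puts mass only on the integers 0, 1, 2, ...
probs = [poisson_pmf(k, lam) for k in range(50)]
total = sum(probs)
print(round(total, 6))
```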
 The denominator (bottom) of the z-score formula is defined as
o the standard deviation(Y)
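A short sketch of the z-score formula, z = (x - mean) / standard deviation, may help here. The scores are made up, and the population standard deviation (`pstdev`) is an assumption; a sample standard deviation could be used instead.

```python
import statistics

scores = [52, 60, 68, 75, 85]  # hypothetical data
mean = statistics.mean(scores)
sd = statistics.pstdev(scores)  # the denominator of the z-score

def z_score(x: float) -> float:
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

print(z_score(85))
```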
 If a test was generally very easy, except for a few students who had very low scores, then
the distribution of scores would be defined as _____
o negatively skewed(Y)
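The scenario can be sketched with invented scores: most values are high and a few are very low, so the left tail is long. The data and the population skewness formula used below are assumptions for illustration.

```python
import statistics

# Hypothetical scores: an easy test with two very low outliers.
scores = [95, 92, 90, 94, 93, 91, 96, 30, 25]

mean = statistics.mean(scores)
median = statistics.median(scores)
sd = statistics.pstdev(scores)

# Population skewness: mean cubed deviation divided by the cubed standard deviation.
skewness = sum((s - mean) ** 3 for s in scores) / (len(scores) * sd ** 3)

print(mean < median)  # the low tail pulls the mean below the median
print(skewness < 0)   # a negative value indicates a longer left tail
```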
 Which of the following approaches should be used to ask a data analysis question?
o find out the question that is to be answered(Y)
 What is the mean of this set of numbers: 4, 6, 7, 9, 2000000?
o 400005.2(Y)
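The arithmetic is (4 + 6 + 7 + 9 + 2000000) / 5 = 2000026 / 5 = 400005.2. A quick check with the standard library also shows how far a single outlier pulls the mean away from the median:

```python
import statistics

values = [4, 6, 7, 9, 2_000_000]
mean = statistics.mean(values)      # sum of the values divided by their count
median = statistics.median(values)  # middle value of the sorted list

print(mean)    # 400005.2
print(median)  # 7
```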
 When do the conditional density functions get converted into the marginal density
functions?
only if the random variables exhibit statistical independence(Y)
 Which of the following are advantages of decision trees?
all of the mentioned(Y)
 What is the purpose of performing cross-validation?
both to assess the predictive performance of the models and to judge how the trained
model performs outside the sample on test data(Y)
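A minimal k-fold sketch, assuming a deliberately trivial "model" that always predicts the training mean. The helper names `k_fold_indices` and `cross_val_mse` are hypothetical, and the folds are contiguous without shuffling to keep the example short.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_val_mse(y, k):
    """Average held-out MSE of a model that predicts the training mean."""
    scores = []
    for held_out in k_fold_indices(len(y), k):
        held = set(held_out)
        train = [y[i] for i in range(len(y)) if i not in held]
        prediction = sum(train) / len(train)  # "fit" on the training folds only
        mse = sum((y[i] - prediction) ** 2 for i in held_out) / len(held_out)
        scores.append(mse)  # score on data the model never saw
    return sum(scores) / len(scores)

y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]  # hypothetical targets
print(cross_val_mse(y, k=3))
```

Each observation is held out exactly once, so the averaged score estimates how the model behaves on data outside the training sample.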
 How can you prevent a clustering algorithm from getting stuck in bad local optima?
use multiple random initializations(Y)
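One way to sketch multiple random initializations is to run a small k-means several times from different random starting centers and keep the run with the lowest within-cluster cost. The 1-D data, the function names, and the fixed iteration count are assumptions for illustration.

```python
import random

def kmeans_1d(points, k, rng, iters=20):
    """Plain k-means on 1-D points, started from k randomly chosen points."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: (p - centers[j]) ** 2)
            clusters[nearest].append(p)
        # Move each center to its cluster mean; keep it in place if the cluster is empty.
        centers = [sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)]
    cost = sum(min((p - c) ** 2 for c in centers) for p in points)
    return centers, cost

def best_of_restarts(points, k, restarts=10, seed=0):
    """Run k-means several times and return the lowest-cost run."""
    rng = random.Random(seed)
    runs = [kmeans_1d(points, k, rng) for _ in range(restarts)]
    return min(runs, key=lambda run: run[1])

points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.2, 8.8]  # hypothetical data
centers, cost = best_of_restarts(points, k=3)
print(sorted(centers), cost)
```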
 You run gradient descent for 15 iterations with α = 0.3 and compute J(θ) after each
iteration. You find that the value of J(θ) decreases quickly and then levels off. Based on
this, which of the following conclusions seems most plausible?
α = 0.3 is an effective choice of learning rate(Y)
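The behaviour described (a fast drop that then levels off) can be reproduced on a toy problem. The cost function J(θ) = (θ - 3)² and the starting point θ = 0 are assumptions; only the learning rate 0.3 and the 15 iterations come from the question.

```python
def cost(theta):
    """Hypothetical convex cost J(theta) = (theta - 3)^2."""
    return (theta - 3.0) ** 2

def gradient(theta):
    """Derivative of the cost with respect to theta."""
    return 2.0 * (theta - 3.0)

alpha = 0.3   # learning rate from the question
theta = 0.0   # assumed starting point
costs = []
for _ in range(15):
    theta -= alpha * gradient(theta)  # gradient descent update
    costs.append(cost(theta))         # J(theta) after each iteration

print(costs[:3], costs[-1])
```

A learning rate that is too large would instead make the recorded costs oscillate or grow, and one that is too small would make them fall very slowly.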
 If P(x) = 0.5 and x = 4, calculate E(x).
2(Y)
 A fair six-sided die is rolled twice. What is the probability of getting a 2 on the first
roll and not getting a 4 on the second roll?
5/36(Y)
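Because the two rolls are independent, the probability is (1/6) × (5/6) = 5/36. The same result follows from enumerating all 36 equally likely outcomes:

```python
from fractions import Fraction
from itertools import product

# All ordered pairs (first roll, second roll) of a fair six-sided die.
outcomes = list(product(range(1, 7), repeat=2))
favorable = [(first, second) for first, second in outcomes
             if first == 2 and second != 4]

probability = Fraction(len(favorable), len(outcomes))
print(probability)  # 5/36
```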
 Suppose you have trained a logistic regression classifier and, for a new example x, it outputs
a prediction hθ(x) = 0.2. This means that
our estimate for P(y=1 | x) is 0.2, so our estimate for P(y=0 | x) is 0.8(Y)
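In logistic regression the hypothesis output is the sigmoid of a linear score and is read as the estimated probability of the positive class, P(y=1 | x). The complement gives P(y=0 | x). The sketch below picks a linear score `z` purely so that the output equals 0.2; that value is an assumption for illustration.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

z = math.log(0.2 / 0.8)  # hypothetical linear score whose sigmoid is 0.2
p_y1 = sigmoid(z)        # estimated P(y=1 | x)
p_y0 = 1.0 - p_y1        # estimated P(y=0 | x)

print(round(p_y1, 3), round(p_y0, 3))
```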
 For the t distribution, increasing the sample size will have an effect on
all of these(Y)
 In a random experiment, the observations of a random variable are classified as ___________
events(Y)
 When do the conditional density functions get converted into the
marginal density functions?
only if the random variables exhibit statistical independence(Y)
 Which of the following is defined as the rule or formula used to test a null hypothesis?
test statistic(Y)
 The point at which the null hypothesis gets rejected is described as
critical value(Y)
 Which of the following are universal approximators?
all of these(Y)
 What is the function of a post-test in ANOVA?
to describe those groups that have reliable differences between group means(Y)
 The process of constructing a mathematical model or function that can be used to predict or
determine one variable from another variable is described as
regression(Y)
 In the regression equation Y = 75.65 + 0.50X, the intercept is
75.65(Y)
 The probability of a type 1 error is described as
α (alpha)(Y)
 Statement 1: It is possible to train a network well by initializing all the weights as 0.
Statement 2: It is possible to train a network well by initializing the biases as 0.
Which of the statements given above is true?
statement 2 is true while statement 1 is false(Y)
 Large values of the log-likelihood statistic indicate
that the statistical model is a poor fit of the data(Y)
 Which of the following would have a constant input in each epoch of training a deep
learning model?
weight between input and hidden layer(Y)
 What is the stability-plasticity dilemma?
dynamic inputs and categorization cannot be handled(Y)
 What is a drawback of template matching?
highly restricted(Y)
 What is true regarding the back propagation rule?
actual output is determined by computing the outputs of units for each hidden layer(Y)
 What is meant by "generalized" in the statement "back propagation is a generalized delta
rule"?
because the delta rule can be extended to hidden layer units(Y)
 How are input layer units connected to the second layer in competitive learning networks?
feed forward manner(Y)
 Which of the following is/are true about random forest and gradient boosting
ensemble methods?
1. Both methods can be used for classification tasks
2. Random forest is used for classification whereas gradient boosting is used for regression tasks
3. Random forest is used for regression whereas gradient boosting is used for classification tasks
4. Both methods can be used for regression tasks
1 and 4(Y)
 In random forest you can generate hundreds of trees (say T1, T2, ..., Tn) and then aggregate
the results of these trees. Which of the following is true about an individual tree (Tk)
in a random forest?
1. An individual tree is built on a subset of the features
2. An individual tree is built on all the features
3. An individual tree is built on a subset of observations
4. An individual tree is built on the full set of observations
1 and 3(Y)
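A rough sketch of the two sources of randomness behind this answer: each tree sees a bootstrap sample of the rows and a random subset of the features. The function name and sizes are hypothetical. Note also that common implementations draw the feature subset at every split rather than once per tree; drawing it once here is a simplification.

```python
import random

def draw_tree_sample(n_rows, n_features, max_features, rng):
    """Pick the row indices and feature indices used to grow one tree."""
    # Bootstrap: sample rows with replacement, so some rows repeat and some are left out.
    rows = [rng.randrange(n_rows) for _ in range(n_rows)]
    # Feature subsampling: choose max_features distinct feature indices.
    features = sorted(rng.sample(range(n_features), max_features))
    return rows, features

rng = random.Random(0)
rows, features = draw_tree_sample(n_rows=100, n_features=10, max_features=3, rng=rng)
print(len(set(rows)), features)
```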
 Suppose you are given the following graph, which shows the ROC curves for two different
classification algorithms, random forest (red) and logistic regression (blue). On the basis
of performance, which of the two algorithms would you take into consideration for your
final model?
random forest(Y)
 In random forest or gradient boosting algorithms, features can be of any type. For example,
a feature can be continuous or categorical. Which of the following options
is true when you consider these types of features?
both algorithms can handle real valued attributes by discretizing them(Y)
 The cell body of a neuron is analogous to what mathematical operation?
summing(Y)
 What is the advantage of basis function over multi-layer feed forward neural
networks?
training of basis function is faster than MLFFNN(Y)
 Suppose you are using a bagging-based algorithm, say a random forest, in model building.
Which of the following can be true?
1. The number of trees should be as large as possible
2. You will have interpretability after using random forest
1(Y)
 What does a basic counter propagation network consist of?
two feed forward networks with a hidden layer(Y)
 The process of adjusting the weights is known as
learning(Y)
 How do you handle missing or corrupted data in a dataset?
all of these(Y)
 In which of the following cases will k-means clustering fail to give good results?
1) Data points with outliers
2) Data points with different densities
3) Data points with non-convex shapes
1, 2 and 3(Y)
 Which of the following scenarios favors a failover cluster instance over a standalone
instance in SQL Server?
high availability(Y)
 The resources owned by a WSFC node include ___________
one file share resource, if the FILESTREAM feature is installed(Y)
 A Windows failover cluster can support up to how many nodes?
16(Y)
 An exciting new feature in SQL Server 2014 is the support for the deployment of a
failover cluster instance (FCI) with ___________
cluster shared volumes (CSV)(Y)
 Which of the following is a Windows failover cluster quorum mode?
node majority(Y)
 The benefits that SQL Server failover cluster instances provide include ____________
all of the mentioned(Y)
 Which of the following arguments is used to set importance values?
set(Y)
 Identify the correct statement
all znodes are ephemeral, which means they are describing a "temporary" state(Y)
 To register a watch on a znode's data, which commands do you need to use to access the
current content or metadata?
stat(Y)
 Which of the following has a design policy of using ZooKeeper only for transient
data?
HBase(Y)
 Which of the following specifies the required minimum number of observations for
each column pair in order to have a valid result?
min_periods(Y)
 All of the following accurately describe Hadoop, except
real-time(Y)
 What are the Vs of big data?
all of these(Y)
 What are the different features of big data analytics?
all of these(Y)
 Facebook tackles big data with _______ based on Hadoop
Project Prism(Y)
 What is a unit of data that flows through a Flume agent?
event(Y)
 As companies move past the experimental phase with Hadoop, many cite the need
for additional capabilities, including _______________
improved security, workload management and SQL support(Y)
 What was Hadoop named after?
The toy elephant of Cutting's son(Y)
 When a job tracker schedules a task, it first looks for
a node with an empty slot in the same rack as the data node(Y)