Mandolin Task ChatGPT Search
Write it without bullet points and phrase it as an ask, so that it begins with
"develop a....."
Can we expect an out-of-the-box code response (copy-paste and it will work fine)
from this prompt?
Response 2
Question : How well did the two responses do? * Reminder: if both responses are
strong, please modify the prompt to make it more challenging, and try again! 1.
One is Good. One is Poor 2. Both Responses are Poor Only answer the question
and explanation in 1 sentence
How many times did you rewrite the prompt before you achieved at least one bad
response? Rate on a scale of 10
Select the steering constraints used in the system prompt and the prompt you
wrote
Label your response according to the questions below.
Steering constraints
So this is response 1
Now I will ask some questions related to it; answer them in simple points in
numbered list format
Tell me all the points where the response strictly adheres and does not adhere to
the constraints in the prompt in simple points
If the score is not a 5, write why you chose this rating (DO NOT simply write out
your calculation instead write why you think each instruction was followed or not
followed) in simple points
Accuracy
From the instruction following requirements, list out the aspects that are
CORRECT and incorrect. They will fall into the following buckets:
Case coverage: The code handles a wide range of possible input scenarios, edge
cases, and error conditions, ensuring that it behaves correctly and robustly.
If the score is not a 5, write why you chose this rating (DO NOT simply write out
your calculation instead write why you think each instruction was followed or not
followed) in short simple points
Optimality/Efficiency *
For this specific criterion, please evaluate the time complexity of the solution. Is it
an optimal solution? E.g., instead of 3 loops, we can have a solution with 1 loop.
From the instruction following requirements, list out the aspects that are
OPTIMAL/EFFICIENT and OPTIMAL/INEFFICIENT. They will fall into the following
buckets:
If the score is not a 5, write why you chose this rating (DO NOT simply write out
your calculation instead write why you think each instruction was followed or not
followed)
in short simple points
Just provide correct and incorrect points in short and simple points
If the score is not a 5, write why you chose this rating (DO NOT simply write out
your calculation instead write why you think each instruction was followed or not
followed) in simple points
Up-to-date
Write a numbered list of all the correct and incorrect up-to-date standards,
function usage, and imported packages identified. Note: do this at a
function/package level.
Mark “Yes” if there is any new or altered code present in the response, regardless
of whether it is an entire program or just a code snippet. Does not apply to new
comments!
1. Yes
2. No
Just answer
Template
Partial Update
Function Update
Out-of-the-Box
NA
What installation commands are necessary to test the code? *
Please separate each individual command with a comma (,) and write ‘N/A’ if
there are no extra install commands required.
What commands are necessary to run the code? Provide a comma separated list
if there are multiple commands. *
Yes
No
Response 2
Tell me all the points where the response strictly adheres and does not adhere to
the constraints in the prompt
in simple points
If the score is not a 5, write why you chose this rating (DO NOT simply write out
your calculation instead write why you think each instruction was followed or not
followed)
in simple short points
Response 1
Response 2
Just answer
Please provide a justification. Use "@Response 1" and "@Response 2" to refer to
the model responses.
in short, within 50 words
Are there any minor or major stylistic or presentational issues in the response
you selected as the "better" one? *
You are REQUIRED to perform a rewrite for any stylistic or presentational issues!
Make sure that the response meets all stylistic standards, including:
- Use of bullet points when there are three or more key points for easy readability,
with paragraphs broken down logically.
- For trivial html/css code, the requirement for comments can be more lenient
[YES] - I will perform a rewrite and change the response to follow all stylistic and
presentational requirements
[NO] - I confirm that the "better" response meets all stylistic requirements
Just answer
Mark “No” if you believe the selected model response fully addresses the prompt
and follows all presentation and style requirements.
- Not performing a rewrite when one is required will heavily affect your score for
the task.
Yes
No - the selected response is perfect and does not have any issues