Milestone 5 Validation and Testing
Deadline: December 8
Objective: Well done! You have now implemented all the requested features. But as
a software engineer, the next important step is to validate the code you have written. In
this milestone, we ask you to validate the implemented functionality of your Android apps
in two ways: some features should be tested with test cases generated by LLMs (Large
Language Models), and others should be tested by writing "instrumented tests". In
response to several requests during the semester, we have also included some bonus
activities in this milestone.
* We are considering an extension with partial credit for teams that lost points in
Milestones 3 and 4 due to missing feature implementations (excluding points deducted
for missing comments, missing weekly reports, or missing peer evaluations). If your team
failed to get a full score in Milestone 3 or 4, you can implement the missing features
until December 4th with a -30% penalty. Such teams should provide a separate
document that (1) explains which features were missing and are now implemented, and (2)
gives the explicit commit number or a link to the pull request corresponding to the
implementation. The team should also resubmit a new video recording demonstrating
that the features are implemented and that the app is functional. We will not regrade
requests that fail to provide the requested information.
Deliverables: To finish the last milestone, please follow the tasks explained under the
“Tasks” section. Your deliverables are: (1) the “test code” pushed to the team’s GitHub
repository; (2) a simple report on the LLM-generated tests as a PDF file,
including which LLM was used, how you accessed it, screenshots or the text of the full
conversation with the LLM used to generate the tests, and how you leveraged
its response to create the test cases (i.e., did you make any changes to make the generated
tests compilable/runnable?); (3) the scripts used to call the LLMs, pushed to the team’s GitHub
repository (if you choose the first way in Task 2); (4) a video (no more than five minutes)
that captures the entire test suite execution and shows that all the tests pass; and (5) the
coverage report of the test execution (optional, with bonus points).
All tests should include assertions to check the behavior of the application against a
ground truth. Failing to include assertions results in losing half of the points for the
corresponding test.
* Bonus points can be given to the entire team (in the case of a team effort) or to individual
team members willing to earn additional credit by doing the mentioned tasks alone.
** Note that team members who fail to submit the peer evaluation form will lose
10 points compared to their teammates. We accept no peer evaluation form
submissions after the deadline. We accept no regrading requests if you fail to
properly submit the form (forgetting to submit the form, entering the wrong team number,
or using a non-Illinois email address).
*** Students reported by their peers as not helping the team progress on this milestone
will lose “at least” 50% of the total grade.
Note: Your TA will run the test suite on their local machine to reproduce the captured
video. In case of a discrepancy, i.e., being unable to run the tests or observing failing tests,
you will lose all of the points for this milestone.
Tasks:
Task 1: While there are several test harness platforms for instrumented test development on
Android, we specifically ask you to develop “Espresso” instrumented tests:
https://developer.android.com/training/testing/espresso
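For reference, the sketch below shows roughly what an Espresso login test with an
assertion could look like. It is a minimal example, not a required structure: the activity
name and view IDs (LoginActivity, R.id.email_field, R.id.password_field, R.id.login_button,
R.id.welcome_text) are placeholders that you should replace with the ones from your own
app, and the asserted behavior should match your actual login flow.

    import androidx.test.espresso.Espresso.onView
    import androidx.test.espresso.action.ViewActions.click
    import androidx.test.espresso.action.ViewActions.closeSoftKeyboard
    import androidx.test.espresso.action.ViewActions.typeText
    import androidx.test.espresso.assertion.ViewAssertions.matches
    import androidx.test.espresso.matcher.ViewMatchers.isDisplayed
    import androidx.test.espresso.matcher.ViewMatchers.withId
    import androidx.test.ext.junit.rules.ActivityScenarioRule
    import androidx.test.ext.junit.runners.AndroidJUnit4
    import org.junit.Rule
    import org.junit.Test
    import org.junit.runner.RunWith

    // Placeholder names: LoginActivity and the R.id.* views must exist in your own app.
    @RunWith(AndroidJUnit4::class)
    class LoginInstrumentedTest {

        @get:Rule
        val activityRule = ActivityScenarioRule(LoginActivity::class.java)

        @Test
        fun validCredentials_showWelcomeScreen() {
            // Fill in the credentials and submit the login form.
            onView(withId(R.id.email_field))
                .perform(typeText("[email protected]"), closeSoftKeyboard())
            onView(withId(R.id.password_field))
                .perform(typeText("correct-password"), closeSoftKeyboard())
            onView(withId(R.id.login_button)).perform(click())

            // Assertion against the ground truth: the welcome view is displayed.
            onView(withId(R.id.welcome_text)).check(matches(isDisplayed()))
        }
    }

Tests like this live under app/src/androidTest and run on a connected device or emulator.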
Task 2: You are required to generate test cases using LLMs to test the first two
functionalities (user sign-up and user login).
You are free to use any LLM you prefer. There are two primary ways to interact with
LLMs and generate test cases; you can choose either of them:
1. Write scripts to call LLM APIs. Many tutorials are available on how to access
   LLMs and call them from your code, for example the OpenAI API reference
   (https://platform.openai.com/docs/api-reference/introduction) and the Hugging Face
   tutorial (https://huggingface.co/docs/transformers/llm_tutorial). Feel free to use
   any other resources you find helpful. A minimal calling script is sketched after this list.
2. Interact with the LLMs through their websites. For example, you can enter your
   prompts directly in the OpenAI Playground
   (https://platform.openai.com/playground/chat?models=gpt-4o) or ChatGPT
   (https://chatgpt.com) and then copy the response.
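If you choose the first way, the sketch below shows roughly what such a script could look
like. It assumes the OpenAI chat completions endpoint with the gpt-4o model and an API key
stored in the OPENAI_API_KEY environment variable; a different provider or model would need
a different endpoint, request body, and response format. It is meant to run as a plain
JVM/Kotlin program on your development machine, not inside the Android app.

    import java.net.URI
    import java.net.http.HttpClient
    import java.net.http.HttpRequest
    import java.net.http.HttpResponse

    fun main() {
        val apiKey = System.getenv("OPENAI_API_KEY")
            ?: error("Set the OPENAI_API_KEY environment variable first")

        // Keep the prompt simple here; a real script should JSON-escape the prompt
        // (or build the body with a JSON library) before embedding code context.
        val prompt = "Write an Espresso test in Kotlin for the login screen of my Android app."
        val body = """
            {"model": "gpt-4o",
             "messages": [{"role": "user", "content": "$prompt"}]}
        """.trimIndent()

        val request = HttpRequest.newBuilder()
            .uri(URI.create("https://api.openai.com/v1/chat/completions"))
            .header("Authorization", "Bearer $apiKey")
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build()

        val response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString())

        // Print the raw JSON response; extract choices[0].message.content from it
        // (e.g., with a JSON library) and save that text as your generated test file.
        println(response.body())
    }

Whichever way you pick, keep the exact prompts and responses: they go into the PDF report
listed in the deliverables.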
Both of these ways require detailed prompts about the scenarios. An LLM cannot write valid
code/tests that use user-defined variables/functions unless the relevant information is
provided in the prompt. That means that, instead of writing a simple standalone prompt, you
need to collect useful information (e.g., function specifications, function implementations,
variable definitions, class context, etc.) from the code base and integrate all of it,
together with your requirements, into your prompt. For the basic prompt design, you may
refer to the following case:
[Figure: an example ChatGPT prompt for the getEnvironment method, annotated with its NL part and CC part.]
The figure above shows an example of writing a basic prompt for the getEnvironment
method and using ChatGPT for generation. It consists of two parts: (i) the natural
language description part (i.e., the NL part), which explains the task to ChatGPT, and (ii) the
code context part (i.e., the CC part), which contains the focal method and the other relevant
code context.
a. CC Part. It includes the following code context: (i) the complete focal method,
including its signature and body; (ii) the name of the focal class (i.e., the class
that the focal method belongs to); (iii) the fields of the focal class; and (iv) the
signatures of all methods defined in the focal class.
Please note that the requirements for these test cases are consistent with those of Task 1.
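To make the CC part concrete, the snippet below sketches one way to assemble such a prompt
in code. The file path, class name, and task description are hypothetical placeholders; in
practice you would paste in (or read from your repository) the focal method, fields, and
method signatures of your own sign-up/login classes.

    import java.io.File

    // Assemble a prompt from an NL part and a CC part.
    // The path and class name are placeholders for your own code base. Pasting the
    // whole class file is a simplification; trimming it down to the focal method,
    // the fields, and the method signatures keeps the prompt shorter.
    fun buildPrompt(): String {
        val focalSource = File("app/src/main/java/com/example/app/LoginActivity.kt").readText()

        val nlPart = """
            Write an Espresso instrumented test (Kotlin, JUnit 4) for the login feature
            of the Android class below. The test must include assertions that check the
            UI state after a successful login and after a failed login.
        """.trimIndent()

        val ccPart = "Focal class: LoginActivity\n" +
            "Relevant code context (focal method, fields, method signatures):\n" +
            focalSource

        return nlPart + "\n\n" + ccPart
    }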
Android Studio has a built-in feature to collect coverage information for unit tests.
However, it is possible to integrate the JaCoCo code coverage tool with Android
Studio to generate coverage reports for the instrumented tests. Create a separate
pull request on the team’s GitHub repo for this bonus task and upload the coverage
report generated by running the test suite to the repo.
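If you attempt this bonus, the Gradle sketch below shows one common way to turn on coverage
collection for instrumented tests. The exact property and task names depend on your Android
Gradle Plugin version, so treat it as a starting point rather than a drop-in configuration.

    // app/build.gradle.kts -- coverage sketch; property names vary across AGP versions.
    android {
        buildTypes {
            getByName("debug") {
                // Recent AGP versions; older versions use isTestCoverageEnabled = true.
                enableAndroidTestCoverage = true
            }
        }
    }

With a device or emulator connected, a task along the lines of
./gradlew createDebugAndroidTestCoverageReport (createDebugCoverageReport on older AGP
versions) runs the instrumented tests and writes an HTML JaCoCo report under
app/build/reports/coverage/, which is the file to attach to the bonus pull request.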
<<Peer Evaluation>>
Each team member should submit the following form for the peer evaluation for
Milestone 5 (we accept no peer review for this milestone after the milestone deadline):