CS427 Milestone 5: Validation and Testing

Deadline: December 8

Objective: Well done! You have now implemented all the requested features. As a
software engineer, the next important step is to validate the code you have written. In
this milestone, we ask you to validate the implemented functionality of your Android apps
in two ways: some features should be tested with test cases generated by LLMs (Large
Language Models), and others should be tested by writing "instrumented tests". Following
several requests during the semester, we have also included some bonus activities in
this milestone.

* We are considering an extension with partial credit for teams that lost points in
Milestones 3 and 4 due to missing feature implementations (excluding points deducted
for missing comments, missing weekly reports, or missing peer evaluations). If your team
failed to get a full score in Milestone 3 or 4, you can implement the missing features
until December 4th with a -30% penalty. Such teams should provide a separate
document that (1) explains which features were missing and are now implemented, and (2)
gives the explicit commit number or a link to the pull request corresponding to the
implementation. The team should also resubmit a new video recording demonstrating that
the features are implemented and that the app is functional. We will not regrade
requests that fail to provide the requested information.

Deliverables: To finish the last milestone, please follow the tasks explained under the
“Tasks” section. Your deliverables are:

1. The “test code”, pushed to the team’s GitHub repository.
2. A short report on the LLM-generated tests, as a PDF file, including which LLM was
   used, how you accessed it, the screenshots or full text of your conversation with the
   LLM used to generate the tests, and how you leveraged its response to create the test
   cases (i.e., whether you made any changes to make the generated tests
   compilable/runnable).
3. The scripts used to call the LLMs, pushed to the team’s GitHub repository (if you
   choose the first way in Task 2).
4. A video (no more than five minutes) that captures the entire test suite execution and
   shows that all the tests pass.
5. The coverage report of the test execution (optional, with bonus points).

Grading rubric: Please find the rubric below.

Activity                                  Points        Team/Individual

Testing user sign up with LLM             10            team
Testing user login with LLM               10            team
Testing adding a new city                 10            team
Testing removing an existing city         10            team
Testing user log off                      10            team
Testing “Weather Insight” feature         10            team
Testing weather feature (two cities)      10 (5 each)   team
Testing location feature (two cities)     10 (5 each)   team
Mocking location test                     10 (bonus)    team (or individual)*
Coverage report                           20 (bonus)    team (or individual)
Peer evaluation                           10            individual**
Total                                     90 (+30)      team***

All tests should include assertions that check the behavior of the application against a
ground truth. Failing to include assertions results in losing half of the points for the
corresponding test.

* Bonus points can be given to the entire team (in case of a team effort) or to individual
team members willing to earn additional credit by doing the mentioned tasks alone.

** Note that team members who fail to submit the peer evaluation form will lose 10
points relative to their teammates. We accept no peer evaluation form submissions after
the deadline, and no regrading requests if you fail to submit the form properly
(forgetting to submit the form, using a wrong team number, or using a non-Illinois email
address).

*** Students reported by their peers as not helping the team progress on this milestone
will lose “at least” 50% of the total grade.

Note: Your TA will run the test suite on their local machine to reproduce the captured
video. In case of a discrepancy, i.e., the tests cannot be run or some tests fail, you
will lose all points for this milestone.

Tasks:

<<Task 1: Writing instrumented tests>>


You are required to manually write instrumented tests to test all functionalities except for
the first two (user sign up, user login).

While there are several test harness platforms for instrumented test development in
Android, we specifically ask you to develop “Espresso” instrumented tests:
https://developer.android.com/training/testing/espresso

You can read more about Espresso here:


https://www.waldo.com/blog/testing-with-espresso-guide
https://kobiton.com/blog/a-complete-guide-to-espresso-android-testing/

When writing tests:


1. Use representative test names. For example, a representative test name for
   validating the user login functionality could be “checkUserLogin” or “testUserLogin”.
   Failing to use representative test names makes grading harder for our team, so
   please help us grade this last milestone as soon as possible.
2. Your test may exercise more than one Android component during execution. For
example, it may click on multiple widgets in Activity 1 and transition to Activity 2
through a button click to validate an assertion.
3. All your tests should include actions accompanied by assertions (a minimal sketch is
   given after this list). For example, a test for validating the signup functionality
   enters the username and password and clicks on the signup button. The assertion
   should then check whether the new user’s information has been successfully added to
   the database (or its equivalent, depending on the implementation). Similarly, a test
   validating the weather functionality can check whether the location name on the
   screen showing the weather information matches the name of the location of interest.
   Your assertion should be relevant to the test scenario; otherwise, you will lose half
   of the points for the corresponding test.
4. When testing the “Weather Insight” feature, you only need to verify the app’s
   behavior, e.g., that pressing the question buttons shows the corresponding responses.
   You don’t need to make assertions on the content of the questions and answers, as
   they are dynamically generated by LLMs.
5. The bonus activity of mocking the location can be done for either the weather
   functionality or the map functionality (see the sketch after the samples link below).
   For example, your test may first show the map of Chicago and then mock the location
   to Champaign as the next step; the assertion should then check whether the shown map
   belongs to Champaign, i.e., whether mocking works.
6. Instrumented tests are fast, i.e., the time between performed actions is on the
   order of milliseconds. Consider small pauses between test steps when recording the
   video so your TA can follow the test execution. The simplest way is to add
   Thread.sleep() between test steps/actions.
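
To make the expectations concrete, here is a minimal sketch of an Espresso instrumented
test for the “add a new city” functionality, written in Kotlin (the same structure works
in Java). MainActivity, R.id.cityInput, R.id.addCityButton, and the displayed city text
are hypothetical placeholders; adapt them to your own app. Note the representative test
name, the action/assertion pairing, and the Thread.sleep() pause for the recorded video.

```kotlin
import androidx.test.espresso.Espresso.onView
import androidx.test.espresso.action.ViewActions.click
import androidx.test.espresso.action.ViewActions.closeSoftKeyboard
import androidx.test.espresso.action.ViewActions.typeText
import androidx.test.espresso.assertion.ViewAssertions.matches
import androidx.test.espresso.matcher.ViewMatchers.isDisplayed
import androidx.test.espresso.matcher.ViewMatchers.withId
import androidx.test.espresso.matcher.ViewMatchers.withText
import androidx.test.ext.junit.rules.ActivityScenarioRule
import androidx.test.ext.junit.runners.AndroidJUnit4
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith

@RunWith(AndroidJUnit4::class)
class AddCityTest {

    // Launch the activity under test before each test method runs.
    @get:Rule
    val activityRule = ActivityScenarioRule(MainActivity::class.java)

    @Test
    fun checkAddNewCity() {
        // Actions: type a city name and press the "add city" button.
        onView(withId(R.id.cityInput)).perform(typeText("Chicago"), closeSoftKeyboard())
        onView(withId(R.id.addCityButton)).perform(click())

        // Small pause so each step is visible in the recorded video (see item 6).
        Thread.sleep(1000)

        // Assertion: the newly added city is now displayed in the city list.
        onView(withText("Chicago")).check(matches(isDisplayed()))
    }
}
```
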
You can find some Android test samples in the following GitHub repository:
https://github.com/android/testing-samples
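
For the mock-location bonus, one possible approach (a sketch only, not a required method)
is to use the mock APIs of FusedLocationProviderClient from Google Play services
(setMockMode / setMockLocation). This assumes your app reads its location from the fused
provider and that the test app is allowed to provide mock locations on the device or
emulator (e.g., via adb shell appops set <your.package> android:mock_location allow).
MapActivity and the on-screen text checked below are hypothetical placeholders.

```kotlin
import android.content.Context
import android.location.Location
import android.location.LocationManager
import android.os.SystemClock
import androidx.test.core.app.ApplicationProvider
import androidx.test.espresso.Espresso.onView
import androidx.test.espresso.assertion.ViewAssertions.matches
import androidx.test.espresso.matcher.ViewMatchers.isDisplayed
import androidx.test.espresso.matcher.ViewMatchers.withText
import androidx.test.ext.junit.rules.ActivityScenarioRule
import androidx.test.ext.junit.runners.AndroidJUnit4
import com.google.android.gms.location.LocationServices
import com.google.android.gms.tasks.Tasks
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith

@RunWith(AndroidJUnit4::class)
class MockLocationTest {

    // Hypothetical screen that shows the map/weather for the current location.
    @get:Rule
    val activityRule = ActivityScenarioRule(MapActivity::class.java)

    @Test
    fun checkMapFollowsMockedLocation() {
        val context = ApplicationProvider.getApplicationContext<Context>()
        val fused = LocationServices.getFusedLocationProviderClient(context)

        // Enable mock mode and inject a fake GPS fix for Champaign, IL.
        Tasks.await(fused.setMockMode(true))
        val champaign = Location(LocationManager.GPS_PROVIDER).apply {
            latitude = 40.1164
            longitude = -88.2434
            accuracy = 1f
            time = System.currentTimeMillis()
            elapsedRealtimeNanos = SystemClock.elapsedRealtimeNanos()
        }
        Tasks.await(fused.setMockLocation(champaign))

        // Give the app a moment to react to the new location (also helps the video).
        Thread.sleep(2000)

        // Assertion: the screen now reflects Champaign rather than the real location.
        onView(withText("Champaign")).check(matches(isDisplayed()))
    }
}
```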

<<Task 2: Generating test cases using LLMs>>

You are required to generate test cases using LLMs to test the first two functionalities
(user sign up, user login).

You are free to use any LLMs you prefer. There are two primary ways to interact with
LLMs and generate test cases. You can choose either of them:
1. Write scripts to call LLM APIs. There are many tutorials available on how to access
   LLMs and call them from your code, for example the OpenAI API reference
   (https://platform.openai.com/docs/api-reference/introduction) and the Hugging Face
   tutorial (https://huggingface.co/docs/transformers/llm_tutorial). Feel free to use
   any other resources you find useful.
2. Interact with the LLMs through their websites. For example, you can enter your
   prompts directly in the OpenAI Playground
   (https://platform.openai.com/playground/chat?models=gpt-4o) or ChatGPT
   (https://chatgpt.com) and get the response.
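
If you go with the first way, the script can be quite small. The sketch below (in Kotlin,
using only the JDK’s java.net.http client; a Python script works just as well) sends a
single prompt to the OpenAI chat completions endpoint and prints the raw JSON response,
which contains the generated test. The model name, the environment variable, and the
placeholder prompt are assumptions; adapt them to the LLM you actually use, and see the
prompt-design discussion below for what the prompt should contain.

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

fun main() {
    // Read the API key from the environment rather than hard-coding it.
    val apiKey = System.getenv("OPENAI_API_KEY")

    // Placeholder prompt; in practice, build it from your code base as described below.
    val prompt = "You are a professional who writes Java test methods. " +
        "Please write a JUnit test for the user sign up functionality"

    // Minimal JSON body assembled by hand to keep the sketch dependency-free;
    // a real script should use a JSON library so the prompt is escaped correctly.
    val body = """{"model": "gpt-4o", "messages": [{"role": "user", "content": "$prompt"}]}"""

    val request = HttpRequest.newBuilder()
        .uri(URI.create("https://api.openai.com/v1/chat/completions"))
        .header("Authorization", "Bearer $apiKey")
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build()

    val response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())

    // The generated test is inside choices[0].message.content of this JSON payload.
    println(response.body())
}
```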

Both ways require detailed prompts describing the test scenarios. An LLM cannot write
valid code/tests involving user-defined variables/functions unless the relevant
information is provided in the prompt. That means that instead of writing a simple
standalone prompt, you need to collect useful information (e.g., function specifications,
function implementations, variable definitions, class context, etc.) from the code base
and integrate all of it, together with your requirements, into your prompts. For the
basic prompt design, you may refer to the following case:

[Figure omitted in this text version: a basic prompt for the getEnvironment method (top)
and the test returned by ChatGPT (bottom).]

The figure shows an example of writing a basic prompt for the getEnvironment method and
using ChatGPT for generation. The prompt consists of two parts: (i) the natural language
description part (i.e., NL part), which explains the task to ChatGPT, and (ii) the code
context part (i.e., CC part), which contains the focal method and the other relevant
code context.

a. CC Part. It includes the following code context: (i) the complete focal method,
including the signature and body; (ii) the name of the focal class (i.e., the class
that the focal method belongs to); (iii) the fields of the focal class; and (iv) the
signatures of all methods defined in the focal class.

b. NL Part. Based on widely acknowledged experience with ChatGPT, the NL part includes
the following contents: (i) a role-playing instruction (i.e., “You are a professional
who writes Java test methods.”) to elicit ChatGPT’s test-generation capability, which
is a common prompt optimization strategy; and (ii) a task-description instruction (i.e.,
“Please write a test method for the {focal method name} based on the given information
using {JUnit version}”).
The top half of the figure shows an example of the basic prompt. After querying with the
basic prompt, ChatGPT then returns a test as shown in the bottom half of the figure.
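
As a concrete illustration, a full prompt for this project might look like the following
(shown here as a Kotlin raw string so it can be dropped into the script from option 1).
All identifiers below (UserRepository, UserDao, loginUser, registerUser, etc.) are
hypothetical; replace them with the actual focal class, fields, method signatures, and
focal method from your own code base.

```kotlin
// Hypothetical example of an NL part + CC part prompt; adapt to your own code base.
val prompt = """
    You are a professional who writes Java test methods.
    Please write a test method for loginUser based on the given information, using JUnit 4.

    Focal class: UserRepository
    Fields:
        private final UserDao userDao;
    Method signatures in the focal class:
        public boolean registerUser(String username, String password)
        public boolean loginUser(String username, String password)
    Focal method:
        public boolean loginUser(String username, String password) {
            User user = userDao.findByUsername(username);
            return user != null && user.getPassword().equals(password);
        }
""".trimIndent()
```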

Please note that the requirements of test cases are consistent with Task 1.

<<Task 3: Optional-Coverage report generation>>

Android Studio has a built-in feature to collect coverage information for unit tests.
However, it is possible to integrate the JaCoCo code coverage tool with your Android
build to generate coverage reports for the instrumented tests as well. Create a separate
pull request on the team’s GitHub repo for this bonus task and upload the coverage
report generated by running the test suite to the repo.
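
One common way to set this up (a sketch only; property and task names vary across
Android Gradle Plugin versions) is to turn on instrumented-test coverage in the module’s
build.gradle.kts and then run the coverage task generated by the Android Gradle Plugin,
which produces a JaCoCo HTML/XML report:

```kotlin
// Module-level build.gradle.kts fragment (assumes a recent Android Gradle Plugin;
// older versions use isTestCoverageEnabled instead of enableAndroidTestCoverage).
android {
    buildTypes {
        getByName("debug") {
            enableAndroidTestCoverage = true
        }
    }
}
```

With coverage enabled, running something like ./gradlew createDebugAndroidTestCoverageReport
(or createDebugCoverageReport on older plugin versions) against a connected device or
emulator typically writes the report under app/build/reports/coverage/, which is what you
would attach to the bonus pull request.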

<<Peer Evaluation>>
Each team member should submit the following form for the peer evaluation for
Milestone 5 (we accept no peer review for this milestone after the milestone deadline):
