Make stoip6 return whether the conversion succeed #72

Merged
merged 2 commits into from
Sep 25, 2018

Conversation

Taiki-San
Contributor

Implement the change suggested by @kjbracey-arm in #71.
stoip6 now returns false if the string is too long, if characters besides hexadecimal characters or ':' are detected, or if it's missing fields.

/* Should really report an error if we didn't get 8 fields */
memset(addr, 0, 16 - field_no * 2);
//Report an error if we didn't get 8 fields
return false;
Contributor

Just in case, for backwards compatibility, keep the memset in. We've already written something to the buffer. May as well fully fill it.

I'd like to minimise the chances of breaking a large body of code using this.

Although we now are being stricter on the is_hex check. Maybe that should carry on but set an "error" flag to return false at the end? Guarantee we don't break anyone?

Contributor Author

Ok with memset.

As for soft failing, I'm not totally comfortable with such a lax parser, especially for something as important as parsing IP addresses. If you want to push to avoid breaking third-party code, I'll make that change, but I'd rather see buggy/unsafe code break before being exploited by hackers. Your call.

Contributor

Tend to agree, but the code currently returns void, and there are lots of users, so even if you make it return false, no existing code is going to be checking the return.

By returning early, you make current users process uninitialised data rather than the definite value they're getting at the moment.

Once people actually start looking at the return value, then it becomes a strict parser.

Contributor Author

Would memset-ing the entire output buffer to 0 every time we return false be an appropriate alternative?

Contributor

@kjbracey kjbracey Jul 10, 2018

Seems plausible. I'm just nervous about any change to the output values here, as I suspect there's not enough CI on this repo - we wouldn't see any problems until incorporated into upstream projects.

Hmm. Okay, make it clear when returning false. (Stylistically I'm fine with a goto error for that).
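
A minimal sketch of the pattern agreed on here, assuming a much simplified parser: every failure path jumps to a single `error` label that clears the whole 16-byte output before returning false, so callers that ignore the return value still see a definite value. `parse_full_ip6` and `hex_digit` are hypothetical names; the function accepts only the full 8-group form with no "::" support, and is not the code from this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Simplified stand-in for stoip6: accepts only "1:2:3:4:5:6:7:8" style input.
 * On any failure the whole 16-byte output is zeroed before returning false. */
static bool parse_full_ip6(const char *s, size_t len, uint8_t out[16])
{
    size_t i = 0;

    assert(s != NULL && out != NULL);

    for (int field = 0; field < 8; field++) {
        uint16_t value = 0;
        int digits = 0;

        while (i < len && s[i] != '\0' && s[i] != ':') {
            int d = hex_digit(s[i++]);
            if (d < 0 || ++digits > 4) {
                goto error;             /* non-hex character or overlong group */
            }
            value = (uint16_t)((value << 4) | (uint16_t)d);
        }
        if (digits == 0) {
            goto error;                 /* empty group */
        }
        out[2 * field] = (uint8_t)(value >> 8);
        out[2 * field + 1] = (uint8_t)(value & 0xFF);

        if (field < 7) {
            if (i >= len || s[i] != ':') {
                goto error;             /* fewer than 8 groups */
            }
            i++;                        /* skip the separator */
        }
    }
    if (i < len && s[i] != '\0') {
        goto error;                     /* trailing characters */
    }
    return true;

error:
    memset(out, 0, 16);                 /* definite value even if the result is ignored */
    return false;
}
```

With a single exit label, a partially written buffer can never leak out of a failed call, whichever check triggered the failure.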

}

// First go forward the string, until end, noting :: position if any
for (field_no = 0, p = ip6addr; (len > (size_t)(p - ip6addr)) && *p && field_no < 8; p = q + 1) {
q = p;
// Seek for ':' or end
while (*q && (*q != ':')) {
q++;
//There must only be hex characters besides ':'
Contributor

Space after //, also below.

@@ -20,15 +20,17 @@
#include "ip6string.h"

static uint16_t hex(const char *p);
static bool is_hex(const char c);
Contributor

Not a fan of meaningless const in declaration prototypes - you're passing by value. You're free to retain it in the definition below, I guess, but there's no const size_t or void * const dest going on, so I think this is just randomly picked up from the line above where the const means something different. This is actually analogous to const char * const p.

(Argument variables can be declared const or not inside a function, indicating the constness of the copy. For the parameters, const is meaningless, and doesn't affect type-checking of definition versus declaration.)
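
A short sketch of the point about by-value `const`: the declaration below omits the qualifier, the definition keeps it, and the two are still compatible in C. The body of `is_hex` is an assumption for illustration, since the actual definition is not shown in this thread.

```c
#include <stdbool.h>

/* Declaration without top-level const: this is all a caller needs to see. */
static bool is_hex(char c);

/* Definition with top-level const: compatible with the declaration above,
 * because qualifiers on a by-value parameter are not part of the function
 * type. Here const only stops the body from modifying its own copy of c. */
static bool is_hex(const char c)
{
    return (c >= '0' && c <= '9') ||
           (c >= 'a' && c <= 'f') ||
           (c >= 'A' && c <= 'F');
}
```

By contrast, the `const` in `const char *p` qualifies the pointed-to data and is part of the type, which is why it matters in the `hex` prototype on the line above.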

Contributor

@kjbracey kjbracey left a comment

If you want to be stricter, should probably reject any >4-digit segments. Could also reject coloncolon being set if already set.
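
The two extra checks suggested here can be sketched as a standalone pre-check, assuming a hypothetical helper named `ip6_shape_ok`. It only looks at group length and the number of "::" occurrences; it does not validate hex digits or field count, and it is not the code that was merged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pre-check: returns false if any group has more than
 * 4 characters or if "::" appears more than once within len bytes. */
static bool ip6_shape_ok(const char *s, size_t len)
{
    size_t run = 0;                      /* characters seen in the current group */
    bool seen_coloncolon = false;

    assert(s != NULL);

    for (size_t i = 0; i < len && s[i] != '\0'; i++) {
        if (s[i] != ':') {
            if (++run > 4) {
                return false;            /* group longer than 4 characters */
            }
            continue;
        }
        run = 0;
        if (i + 1 < len && s[i + 1] == ':') {
            if (seen_coloncolon) {
                return false;            /* second "::" */
            }
            seen_coloncolon = true;
            i++;                         /* consume the second ':' */
        }
    }
    return true;
}
```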

}
return true;

error:
// Fill the output buffer with 0 so we stick to the old failure mechanism
Contributor

I don't think this exactly matches the old failure in all cases (e.g. non-hex digits), so probably best not to claim that.

Contributor Author

Is the new comment less ambiguous, or should I drop the mention of the old behavior?

Contributor

@kjbracey kjbracey left a comment

Looks fine now apart from formatting. Make sure you run the unit test, and you should probably add a couple of tests for the new failure modes.

q = p;
while (*q && (*q != ':')) { // Seek for ':' or end
if (!is_hex(*q++)) { // There must only be hex characters besides ':'
goto error;
}
}

if((q - p) > 4) { // We can't have more than 4 hex digits per segments
Contributor

Formatting - space after if. "Per segment"

// Check if we reached "::"
if ((len > (size_t)(q - ip6addr)) && *q && (q[0] == ':') && (q[1] == ':')) {
if(coloncolon != -1) { // We are not supposed to see "::" more than once per address
Contributor

Formatting - space after if

@Taiki-San
Contributor Author

Rebased the patches after fixing the formatting you mentioned.
Also tweaked the unit tests: they now check whether the call succeeded or failed, and a new test validates the failure in a few scenarios.

@Taiki-San
Contributor Author

I don't have the toolchain to run the tests locally, but are they run by the CI? 504b8b2 should have failed.

@kjbracey
Contributor

I fear the CI is not running those unit tests on PRs. This repo is low-traffic enough we've not spent much time on automation. I can run them locally.

"FFFF:FFFF::FFFF::FFFF", // Two ::
"F:F:F:FqF:F:F:F:F", // Non-hex character
"F:F:F:FFFFF:F:F:F:F" // >4 hex characters in a segment
}
Contributor

Missing ;

uint8_t ip[16];
uint8_t correct[16] = {0};

const char* invalidArray[] =
Contributor

Formatting

"F:F:F:FFFFF:F:F:F:F" // >4 hex characters in a segment
}

for(uint8_t i = 0; i < 3; ++i) {
Contributor

Formatting

}

for(uint8_t i = 0; i < 3; ++i) {
CHECK(false == stoip6(invalidArray[i], strlen(invalidArray[i], ip)));
Contributor

Misplaced )


for(uint8_t i = 0; i < 3; ++i) {
CHECK(false == stoip6(invalidArray[i], strlen(invalidArray[i], ip)));
CHECK(0 == memcmp(ip, correct, 16));
Contributor

This assert failed, don't know which iteration.

uint8_t ip[16] = {0};
uint8_t correct[16] = {0};

CHECK(false == stoip6(addr, strlen(addr), ip));
Contributor

Assertion failed.

uint8_t correct[16] = {0};
// This should stop parsing when too short address given.
// Despite partial parsing, the entire buffer should be filled with zeroes
CHECK(false == stoip6(addr, strlen(addr), ip));
Contributor

Assertion failed.

@Taiki-San
Contributor Author

This should fix the errors. I apologize for taking so much of your time on those trivial issues.

Contributor

@kjbracey kjbracey left a comment

Okay, tests pass.

@Taiki-San
Contributor Author

Do you want me to rebase before merging?

@kjbracey
Contributor

Yes, squash it together as much as is sensible to make it neat.

@Taiki-San
Contributor Author

Taiki-San commented Jul 12, 2018

Just noticed we're fairly lax in regard to the len argument. Specifically, if the string isn't \0-terminated, we'll over-read it. This could cause issues if the same buffer is used to parse multiple addresses.
Not sure if we should be more aggressive with stopping/reporting error, as there is a high chance that this will break third-party code, thanks to off-by-one bugs.

@kjbracey
Contributor

Probably worth tightening that. The primary use case for specifying length is when we're parsing things like http://[fe80::1] or fe80::/64 - some other layer has already identified a separator. So normally that next byte would be something that would stop the parse. I wouldn't expect someone to have a stale hex digit in the terminator position, but I guess we should be robust.

And hopefully everyone who is parsing stuff like the above is passing an accurate length showing the separator, or they'll now get 0.
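
As a rough illustration of "some other layer has already identified a separator", the sketch below finds the text between '[' and ']' in a URL-like string and reports its length, which a caller could then pass as `len`. `bracketed_host` is a hypothetical helper introduced for this example, not part of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: locate the text between '[' and ']'. On success,
 * stores a pointer to the first address character in *start and returns
 * the number of characters before ']'. Returns 0 if no bracketed span
 * is found, leaving *start untouched. */
static size_t bracketed_host(const char *url, const char **start)
{
    assert(url != NULL && start != NULL);

    const char *open = strchr(url, '[');
    if (open == NULL) {
        return 0;
    }
    const char *close = strchr(open + 1, ']');
    if (close == NULL) {
        return 0;
    }
    *start = open + 1;
    return (size_t)(close - (open + 1));
}
```

A caller would then convert with something like `stoip6(start, n, addr)`, following the argument order used by the tests in this thread, so the parser never has to look at the ']' byte.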

@Taiki-San
Contributor Author

If this change is what you had in mind, I'll squash it and this PR should be ready to merge.

if (coloncolon != -1) { // We are not supposed to see "::" more than once per address
goto error;
}
coloncolon = field_no;
q++;
len -= 1;
Contributor

These should probably be len-- just to match surrounding style.

Contributor

@kjbracey kjbracey left a comment

Yes, that should be fine. Squash away.

@Taiki-San
Contributor Author

Taiki-San commented Jul 13, 2018

Done, should be good to merge.

@Taiki-San
Contributor Author

@kjbracey-arm Would you mind merging this PR? Once it's done, I'll be able to update ARMmbed/mbed-os#6293 to use those new APIs.

@kjbracey
Contributor

kjbracey commented Aug 9, 2018

We'll be merging this shortly, but I'm afraid this isn't going to make the release cycle for mbed OS 5.10, so suggest just sticking with the APIs from your previous PRs initially.

@kjbracey
Contributor

Reverted due to some system test failures, including in unit tests. (Actual PR checks are minimal on this repo, so not spotted earlier). @artokin will give some details.

When he does, please add the failures to the unit tests, and make sure you run them.

@Taiki-San Taiki-San restored the ipv6 branch September 26, 2018 10:36
@artokin
Contributor

artokin commented Sep 26, 2018

During the test it failed to convert 2001::0/64 [32:30:30:31:3a:3a:30:2f:36:34]; is_hex fails on character 2f.
Another failure happens with address: [66:65:38:30:3a:3a:31]

When running unittest I get the following failure:

./stoip6_unit_tests 
..................
stoip6test.cpp:37: error: Failure in TEST(stoip6, ZeroAddress)
	CHECK(0 == memcmp(ip, ipv6_test_values[i].bin, 16)) failed

.
Errors (1 failures, 19 tests, 19 ran, 42 checks, 0 ignored, 0 filtered out, 0 ms)
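
The bracketed byte dumps in the report above are the character codes of the input strings. A small sketch, assuming an ASCII execution character set, that turns them back into text; `dump_to_string` is a hypothetical helper and the arrays are copied from the report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte values copied from the failure report. */
static const unsigned char dump_a[] = {0x32, 0x30, 0x30, 0x31, 0x3a, 0x3a, 0x30, 0x2f, 0x36, 0x34};
static const unsigned char dump_b[] = {0x66, 0x65, 0x38, 0x30, 0x3a, 0x3a, 0x31};

/* Copy a byte dump into a NUL-terminated string; dst must hold n + 1 bytes. */
static void dump_to_string(const unsigned char *src, size_t n, char *dst)
{
    assert(src != NULL && dst != NULL);
    memcpy(dst, src, n);
    dst[n] = '\0';
}
```

Decoded this way, the first dump reads `2001::0/64` (0x2f is '/') and the second reads `fe80::1`.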

@kjbracey
Contributor

kjbracey commented Sep 26, 2018

I did believe that first one should fail, because it was up to the caller to detect the '/' beforehand (e.g. via sipv6_prefixlength, if it's a prefix context) and exclude it by restricting len, rather than the conversion function quietly stopping on '/'.

In other words, that first one should work if len == 7, but fail if len > 7. (And I guess len == 6 would be valid, and len < 6 invalid?)
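
A sketch of the caller-side split described here, assuming a hypothetical helper `addr_len_before_prefix` that measures the address part so the '/' and prefix length are never handed to the conversion function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of characters before the first '/', or the
 * full string length if there is no '/'. */
static size_t addr_len_before_prefix(const char *s)
{
    assert(s != NULL);

    const char *slash = strchr(s, '/');
    return slash ? (size_t)(slash - s) : strlen(s);
}
```

For `2001::0/64` this yields 7, the length that is expected to convert; passing the full string length instead is the case that is expected to fail.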

@Taiki-San
Contributor Author

I think I caught the underlying issue, will make a new PR shortly.
As for running the unit tests, I had a quick look and it looks like they're designed to run on Linux with quite a few hard-coded paths. On macOS, this makes running them non-trivial. If you have a working setup, could you give them a quick run?

@artokin
Contributor

artokin commented Sep 27, 2018

Unit tests pass now. I'm running other tests to verify the functionality.
