ID:               48012
 User updated by:  dave at ifox dot com
 Reported By:      dave at ifox dot com
 Status:           Wont fix
 Bug Type:         Feature/Change Request
 Operating System: All
 PHP Version:      5.3.0RC1
 New Comment:

Thanks for the response, but I see you misunderstood my post.  I was
talking about string-to-boolean conversion, not string-to-integer
conversion.

Implicit conversion of string to integer is correct in PHP:

  '0' == 0      yields TRUE, of course
  '123' == 123  yields TRUE, of course
  ' 45 ' == 45  yields TRUE, of course
  '' == 0       yields TRUE, of course

I'm talking only about the special case that doesn't hold up well
logically.  In conversion to boolean (explicit or implicit):

  'Hello'       yields TRUE, string is non-empty
  ' '           yields TRUE, string is non-empty
  '123'         yields TRUE, string is non-empty
  '00'          yields TRUE, string is non-empty (regardless of zero value)
  ' 0'          yields TRUE, string is non-empty (regardless of zero value)
  '0 '          yields TRUE, string is non-empty (regardless of zero value)
  '0.0'         yields TRUE, string is non-empty (regardless of zero value)
  '0x0'         yields TRUE, string is non-empty (regardless of zero value)
  ''            yields FALSE, string is empty
  '0'           yields FALSE, even though string is non-empty, simply because of a single ASCII '0' ???
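
The list above can be checked with a minimal sketch like the following.
The name falsy_strings is a hypothetical helper for illustration, not a
PHP built-in; the sample strings are the ones from the list.

```php
<?php
// Sample strings copied from the list above.
$samples = array('Hello', ' ', '123', '00', ' 0', '0 ', '0.0', '0x0', '', '0');

// Hypothetical helper: return only the strings that convert to boolean false.
function falsy_strings(array $strings) {
    $falsy = array();
    foreach ($strings as $s) {
        if (!(bool) $s) {
            $falsy[] = $s;
        }
    }
    return $falsy;
}

// Print each string next to the result of its boolean conversion.
foreach ($samples as $s) {
    printf("%-8s => %s\n", var_export($s, true), ((bool) $s) ? 'TRUE' : 'FALSE');
}
```

Only the empty string and the single character '0' should come back from
falsy_strings($samples).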

But wait, '0' is an alphanumeric string! PHP is now the only language
in the world, web or otherwise, that would make an assumption about a
string's NUMERIC value when converting to a BOOLEAN. It may have been
more appropriate to follow other languages which only analyze the
presence of CONTENT in the string.

Am I at least making sense?

Obviously you won't take the step to assuming '0.0' is false, or any of
the silly ideas people have submitted as reports ('false'), but why take
the initial step?

By the same logic that you would argue '0.0' is an alphanumeric string
and ' 0 ' is an alphanumeric string, and should not be interpreted as a
boolean false, you should argue that '0' gets the same protection from
coercion to a numeric value just for the boolean evaluation (and test).

Of course, you could have deviated more drastically, and performed a
numerical evaluation of every string, and see if it contains a zero
number, and that would be incredibly inefficient, but it would follow
your logic.

If you do see my point, and have compared to other languages, you may
see what I'm talking about.

A change of this behavior would yield fewer errors by programmers that
liked the C-, Java- and JavaScript-esque beauty of PHP but didn't catch
it in the documentation (most programmers). However, I realize why you
might not change the behavior -- existing code might make assumptions
about user input that is intended as numerical and where tests against
that numerical value are made.  But realize those programs are already
having to convert to integer anyway, because " 0" would be interpreted
as non-zero.  You see?
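
As a rough sketch of that point: code that wants a numeric test has to
treat the input as a number explicitly, since the boolean test alone
accepts " 0". The helper name is_zero_number is an assumption made for
this example, not something from PHP itself.

```php
<?php
// Hypothetical helper: true only when $raw is a numeric string whose
// value is zero. Non-numeric input (including '') returns false.
function is_zero_number($raw) {
    return is_numeric($raw) && (float) $raw == 0.0;
}

$raw = ' 0';                        // stand-in for a form field value
var_dump((bool) $raw);              // bool(true): the boolean test sees a non-empty string
var_dump(is_zero_number($raw));     // bool(true): the numeric test sees zero
```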

I just wanted to comment, and hope that you recognize that this is
unusual, and it leads to broken programs -- I know, I've fixed them!

Thanks of course!


Previous Comments:
------------------------------------------------------------------------

[2009-04-18 19:28:05] [email protected]

PHP is first and foremost a Web language, not a general-purpose
scripting language.  Since the Web is not typed and everything is a
string, I had to do things slightly differently early on to make PHP do
what people expected.  Specifically, "123"==123 needs to be true in
order to not have to type cast every single numeric user input.  Given
that, then it also follows that '0'==0 and if you continue with that and
consider that 0==false then it makes sense that '0'==false.

However, '0'===false is, of course, false.  This is why we have the
strict type-comparison operators in PHP.  
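
The chain of comparisons described above can be written out as a small
sketch. The function name loose_chain is hypothetical and exists only to
group the checks.

```php
<?php
// Each entry evaluates one comparison from the explanation above.
function loose_chain() {
    return array(
        '"123" == 123'  => ("123" == 123),
        "'0' == 0"      => ('0' == 0),
        '0 == false'    => (0 == false),
        "'0' == false"  => ('0' == false),
        "'0' === false" => ('0' === false),
    );
}

// Print each comparison with its result.
foreach (loose_chain() as $expr => $value) {
    echo $expr, ' is ', var_export($value, true), "\n";
}
```

The first four loose comparisons evaluate to true; only the strict
comparison on the last line evaluates to false.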

Basically if we change '0' to be true, then we also have to trickle
that change up resulting in '123'!=123 which would break every app out
there.  So, while I understand your point, it simply isn't going to
happen.

------------------------------------------------------------------------

[2009-04-18 19:05:48] dave at ifox dot com

Description:
------------
PHP Developers,

I typed a very logical explanation of why more careful thought should
have been put into the addition of this conversion:

  "0"  =>  FALSE

but the submission failed and so I will try to reproduce it.

When I learned about the exception of the string "0" converting to a
boolean false, I had a sick feeling in my stomach. An entire team of
language developers fell down the slippery slope caused by a handful of
programmers in 2003 and 2004 that wanted to shorten their code, and
thought that "0" -> TRUE was a bug.

Every other language in existence converts a non-empty string to a
boolean TRUE, because they prevail in this logic:  the only meaning that
SHOULD be derived from an alphanumeric string is its alphanumeric
content, unless first cast to a numeric (or other) type.  For a language
that tries to bring in the best of other languages, PHP should at least
mimic the logical behavior of those languages.

I have edited dozens of web sites to fix well-structured code by
skilled programmers that didn't expect this behavior, particularly when
checking for the existence of strings from databases or user input (form
POSTs).  As an example:

  if ($_POST["uid"] && ...) { ... }

now must be changed to:

  if (strlen($_POST["uid"]) && ...) { ... }

and sometimes there are dozens of these in series.  This begs
programmers to be sloppy and just let "0" fail as an unusual case even
when valid as input.  And even skilled programmers can't be expected to
read and catch this tiny exception in your documentation.
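
One way to package the strlen-style check so it is not repeated in
series is sketched below. The helper name has_text and the $post array
are assumptions for illustration; $post stands in for $_POST.

```php
<?php
// Hypothetical helper: true when $key is present in $input and holds a
// non-empty string, so the valid input "0" is accepted.
function has_text(array $input, $key) {
    return isset($input[$key])
        && is_string($input[$key])
        && $input[$key] !== '';
}

$post = array('uid' => '0', 'name' => '');   // stand-in for $_POST
var_dump((bool) $post['uid']);        // bool(false): plain truthiness rejects "0"
var_dump(has_text($post, 'uid'));     // bool(true)
var_dump(has_text($post, 'name'));    // bool(false)
var_dump(has_text($post, 'missing')); // bool(false)
```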

There are other ramifications from not respecting the alphanumeric
string for the purpose it was intended, to hold alphanumeric values.  I
will not expound here.  The fact that FORM POSTs yield strings is not
something to streamline into the assumption that a user "meant" to
enter an integer.  Only a beginning programmer wanting a shortcut hack
would expect or want this behavior.

And what of " 0", "0 ", "00", "false", and "0.0"?  They are respected! 
Originally I thought the "0" conversion was an attempt to make
bool(string(FALSE)) == FALSE, but it already does since string(FALSE) ==
"" (although in other languages, it yields TRUE, because it is
appropriate to conclude that the string representation of any boolean
value has length and is therefore TRUE).

JavaScript, as a common example, understands what an alphanumeric
string is for:

  <script>
    if ('0') document.write('YES');
  </script>

This yields "YES", of course.  Unfortunately there are now people out
there posting that JavaScript is broken.  But they are beginners, of
course.  A language should never make assumptions about a programmer's
users' intent when providing input.  If a user intends "0" as a string,
why assume that is a numerical value?  Don't you need to now assume all
sorts of zero-value strings?

I have developed two loosely-typed languages, and I made the choice to
treat non-empty strings as TRUE.  I find PHP for the web very usable,
but I was completely surprised by this choice, and I'm sorry to say that
it has resulted in high-quality code yielding unexpected subtle failures
for its users.

I modified PHP (14 or so changes in the Zend engine) to remove the
feature and I made a patch.  The ill stomach went away, and PHP now
respects alphanumeric strings, but now I am uniquely conscious in a
world of assumptions yielding unexpected results.  I guess that's the
beauty and curse of open source.

What were the thought processes in creating this "feature"?  Please
consider its removal!

Dave May


Reproduce code:
---------------
$str = "0";
if ($str) echo "TRUE"; else echo "FALSE";
---
From manual page: language.types.boolean
---

Expected result:
----------------
TRUE

Actual result:
--------------
FALSE


------------------------------------------------------------------------

