PHP formal semantics?


I am tasked with learning PHP, but there are many things I don't understand. For example, the concept of "variable functions" is not one I've seen anywhere else. There are many other examples, but for brevity, I found PHPWTF, which has many examples of PHP's idiosyncrasies.

Most other languages I've used have either a formal specification (e.g., Haskell 2010) or at least a research paper on their formal semantics (e.g., this for JavaScript). However, I can't find anything comparable for PHP.

There is an official "language reference". However, it is very informal, reads like a wiki, and is missing entire sections (e.g., the section on syntax doesn't define the syntax at all). Confirming what I suspected, this guy tells me that there is no official specification, nor even a defined syntax.

Wikipedia has an article on "PHP syntax and semantics", but it only touches on the syntax, and barely mentions semantics.

One paper I've found on PHP is this paper on its assignment semantics. This is a very small fragment of the language and probably not much use to me without some context. There is also this paper on 'SaferPHP', which presumably has to work with some definition of PHP, though I couldn't see any.

Interpreters/compilers provide a semantics, so I thought to look at these. However, the Zend source is intimidating (though it does provide useful test cases), and HipHop runs to 2.7 million LoC. (I find it amazing that people have poured enormous effort into writing compilers for a language without ever writing something like a specification.)

I thought of looking at type systems for PHP for guidance, much like TypeScript provides some guidance for JavaScript. I found these tantalising slides on Hack, an optional type system for PHP. However, it's just slides, and the project seems to be an internal one at Facebook at this time.

Does anyone know of anything better than these poor man's semantics? Or does everyone just "learn by example"?


There are 5 answers

Daniele Filaretti (best answer)

This answer comes a bit after your initial question, but we now finally have a formal semantics for PHP. Check it out: http://www.phpsemantics.org. A paper about it was recently published in the ECOOP 2014 proceedings; if you are interested, you can find the link on that page. Regards.

halfer

Interesting question. I'd regard the manual as the official language reference; I appreciate it isn't quite a "formal reference" in the sense you are seeking, but I'm not sure how widely such a thing would be wanted as something to learn from.

I'm not familiar with PHPWTF, but I'd guess it is in the same mould as the blog post Fractal Of Bad Design (linked by @alexis earlier). I can't peer into the mind of either author, but it seems to me that they are written from the perspective of wanting PHP to be bad. Religious wars frequently dominate on the internet and in programming — the browser you prefer, the IDE/editor you use, your operating system and your choice of framework have all had the same ferocious, partisan and unyielding treatment. Programming languages are, sadly, no different.

It is certainly true that PHP has a number of design inconsistencies, in particular in how nulls are treated and in the ordering of parameters in standard functions. However, it is also true that PHP has been hugely successful despite all that. It spent a long time in the reliability doldrums in 5.0 and 5.1; 5.2 was stable but arguably not enterprise-ready; and it is finally coming of age from 5.3 onwards.

Whilst this might be my biases emerging, I sense a consensus amongst users I read on Stack Overflow that all of the popular languages have their place. This is partly a response to the reality that the ones we dislike won't go away, and partly perhaps that learning .NET, Java, Perl, Ruby, PHP, Python, etc. is pretty much always a good thing. Maybe we have also collectively tired of the flame wars over each (Java is bloated, PHP is inconsistent, Microsoft is vendor lock-in, Rails is unstable, and so forth).

I've veered rather off-topic, but I tend to regard this particular viewpoint as worth reading, especially for those who would be traditionally minded to disagree with it in relation to PHP.

To address the purpose of your question: how should you learn? Well, learning by example is an excellent approach; one just needs to know which examples to learn from. Searching for "PHP tutorial" and "PHP beginner" will, as is perhaps the case with any language, offer a mix of excellent and dreadful material. One might argue that PHP's low barriers to entry have given rise to a large stock of insecure and badly written "how to" articles, and I've certainly seen quite a few!

I think the solution is to look directly at code from well-engineered projects and to learn from there, such as:

  • Symfony2 (and Components)
  • Zend Framework
  • Guzzle
  • Propel
  • Doctrine

Ah, nearly forgot; this website is also a good place to start.


Postscript: variable functions may be referred to by a different name in other languages, but I expect they all have them. In JavaScript, for example, it's object[myFunc](), where myFunc is a string.
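
As a minimal sketch of the JavaScript form mentioned above: the `handlers` object, its two methods, and the `callByName` helper are all hypothetical names made up for illustration. The only point is that a function can be selected at run time by a string holding its name.

```javascript
// Hypothetical object whose methods will be looked up by name at run time.
const handlers = {
  greet(name) { return `Hello, ${name}`; },
  shout(name) { return `${name.toUpperCase()}!`; },
};

// myFunc holds the *name* of the function as a string, not the function itself.
function callByName(object, myFunc, ...args) {
  // Guard against names that do not resolve to a callable property.
  if (typeof object[myFunc] !== 'function') {
    throw new TypeError(`${myFunc} is not a function`);
  }
  // This is the object[myFunc]() form: look the property up, then call it.
  return object[myFunc](...args);
}

console.log(callByName(handlers, 'greet', 'world')); // Hello, world
```

The guard is a design choice for the sketch rather than something the language requires; without it, calling a missing name fails with a less specific error.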

alexis

It seems that you're not after an official standard (which might be useful, for example, to someone writing an independent conforming implementation), but a presentation of the language that will allow you to make coherent sense of it. Unfortunately there cannot be such a thing, because PHP does not have a coherent formal model behind it. It has grown organically and is now saddled with inconsistencies, most notoriously in function and method naming, but also in worrisome little details such as what counts as true and false.

The best one can do to approach PHP, in my opinion, is to get a good feel for the core features and libraries, for the "gotchas" that you need to watch out for, and (in order to read existing code without distraction) for the anti-patterns that are all too common in real-world PHP scripts. My guess is that it's best to learn PHP under the tutelage of people who know how to work with it effectively, but I didn't have that luxury. (Regarding the documentation: it took me forever to notice that you can use square brackets to index into strings. The feature may be mentioned somewhere in the documentation, but not, back then at least, anyplace where it belongs.)

This article gives a nice tour of the kind of things that make a semantic model of the kind you want impossible. (You may want to skip the opening rant and go straight to the discussion of PHP features.) There are many, many other similar texts. Quote: "PHP was originally designed explicitly for non-programmers (and, reading between the lines, non-programs); it has not well escaped its roots."

Don't get me wrong: I work with PHP, and although it's not my favorite language, I wouldn't say I hate it. I would say that to work effectively with it, one must be aware of its nature and limitations. If you're coming to this from Haskell, you're in for quite a shock.

Kevin Crawley

This doesn't directly address your question, but it explains some of the magic behind PHP variables.

http://webandphp.com/how-php-manages-variables

jameshfisher

It's not exactly a formal semantics, but, after all these years, the HHVM project has produced a PHP specification!