Hacker News
Python alternatives for PHP functions (php2python.com)
129 points by weslly on Dec 25, 2012 | 81 comments



I think in some cases, it would be better to explain why the PHP function was a bad idea to begin with, rather than letting people bring its mess over to Python. For example, Python doesn't need dumb shit like nl2br() [1].

[1] http://www.php2python.com/wiki/function.nl2br/


Here was I thinking "oh well, topic mentions PHP, how long before people start popping up with obligatory completely irrelevant "PHP sucks" posts?" And here you are.

nl2br is a useful function. People use it. If you don't need it, don't use it. How is calling it names helpful to anything? You didn't even bother to explain why "the PHP function was a bad idea to begin with" or why it's a "mess" or why Python people would never need it. But you certainly found time to call it "dumb shit". Here comes the downvote.


I think that your idea is correct. I would rephrase it to say, "... explain why the PHP function does not belong in the language specification or the standard library to begin with..."


Not disagreeing with what you said, but your quoted example of a Python implementation of `nl2br` is wrong: PHP's original implementation handles all of them (\r\n, \n\r, \n and \r). See: http://php.net/manual/en/function.nl2br.php & http://phpjs.org/functions/nl2br/
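
For reference, a minimal Python sketch of that behavior, assuming the semantics described in the PHP manual (insert `<br />` before each newline sequence and keep the newline itself). The name `nl2br` and the regex are only an illustration, not code taken from php2python.com or php-src:

```python
import re

# Two-character sequences come first so "\r\n" and "\n\r" count as one break.
_NEWLINE = re.compile(r"\r\n|\n\r|\n|\r")

def nl2br(text):
    """Insert '<br />' before every newline sequence, keeping the newline."""
    return _NEWLINE.sub(lambda m: "<br />" + m.group(0), text)

print(repr(nl2br("one\r\ntwo\nthree")))
```

Note that this does no HTML escaping at all; it only touches newlines.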


Here's what's going on behind the scenes in PHP: https://gist.github.com/4372596

(Taken from https://github.com/php/php-src/blob/PHP-5.4.9/ext/standard/s... )


What's wrong with nl2br?

Seems to do what it says on the tin.


Except the problem most people actually want to solve is not "convert newlines to <br>", it is "convert some text to html", and this function may lead people to think that it does that, when it only does a tiny fraction of that. And that's how injections are born.


I've never once had the impression that nl2br did anything more than make whitespace significant in an HTML document. Even during my first days of using PHP... I don't think there's anything in the documentation or even trivially basic experimentation that could reasonably lead someone to believe that.

Plus the actual HTML-escaping tools (htmlspecialchars, htmlentities) do not make whitespace significant.

Though these days, you might arguably be better off with "white-space: pre-line" in CSS instead.


It does less than making HTML whitespace-significant. It's unsuitable for use on HTML markup, because newlines in <script>, inside tags, attributes, comments, etc. should not be changed.

It's only safe and reliable as a part of nl2br(htmlspecialchars()) combo, so a function that does both could have been a better idea.
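
A rough sketch of what such a combined function could look like in Python. `text_to_html` is a hypothetical name; `html.escape` is in the standard library (Python 3.2+). It escapes first and adds the breaks second, so any tags in the input end up displayed as text:

```python
import html
import re

def text_to_html(text):
    """Escape HTML special characters, then mark line breaks with <br />."""
    escaped = html.escape(text)  # &, <, > and quotes become entities
    return re.sub(r"\r\n|\n\r|\n|\r", lambda m: "<br />" + m.group(0), escaped)

print(text_to_html("4 < 5\n<b>not bold</b>"))
```

The order matters: adding the breaks first and escaping afterwards would escape the `<br />` tags too.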


If you're dealing with ascii plaintext the main significant difference is how it deals with new lines vs HTML.

The function never purports to do anything other than convert newlines to BR tags.


What is being suggested is that the plaintext may contain a random html tag (Perhaps if supplied by a user, or perhaps because that tag is meant to be displayed as plaintext as well. The reason doesn't really matter, "tags" in any situation are still valid plaintext). By passing text to something that replaces new lines with br the implication is that it is now safe to drop into HTML -- however now that tag from before can take effect (particularly bad if it's a script tag). Thus, this function doesn't make sense unless it is at least also coupled with HTML escaping.


> By passing text to something that replaces new lines with br the implication is that it is now safe to drop into HTML

I don't see how that's implied at all. After all, the function is named nl2br, not html2text.


It's implied because basically the only context under which <br>'s are used is when appearing in HTML. If someone is taking text, converting the newlines to <br>'s, then there's a 99.9% chance that the next step is that that text is going to be placed in a larger HTML document. Unless of course I'm forgetting some alternative use of <br>'s.

I agree the function does exactly what it says it will do. And if this was a private function used by something like text2html internally, then maybe it might be a fine function. However, as a public function, the argument is that it inspires bad programming practices, since again, it is almost certainly being used as a primitive form of "sanitation" or "conversion" before displaying plaintext in a larger HTML document.

I think if you could come up with an example of how this would be used NOT as an immediate precursor to dropping into HTML I could be convinced otherwise (and saying it is used after the other tags go through a sanitation process is a poor response, since it means this function must always follow the other one -- further proving its uselessness as a standalone function).


I have never, ever seen nl2br referred to as making anything secure or safe. It just converts new lines to <br />s. That's what the manual says it does. That's what tutorials say it does. That's what the function name very obviously shows.

I think map() from Python should be removed. Its name implies to a new learner that it will draw a map, but it actually does nothing to that effect at all! No, it maps an array to a function. We must rename this dangerous function to call_a_function_on_every_element_of_an_array - or, even better, remove it from the language core ENTIRELY. If it was a private function used inside the runtime, maybe that would be fine, but it's a public part of the API.


>I have never, ever seen nl2br referred to as making anything secure or safe.

There is also no mention in the manual that it is unsafe! One of the big problems with PHP is how easy it is to write dangerous code and how the standard manuals and tutorials often give little explanation to this.


It should be better documented to not directly print user input in HTML context, and there should be a very obvious best-practices[1] guide for newbies explaining what to do (and what not to do). But nl2br could only be dangerous if you misunderstand the function's name, description, sample code, and everything else on the documentation.

[1] as if they would read it...


> By passing text to something that replaces new lines with br the implication is that it is now safe to drop into HTML

> I don't see how that's implied at all. After all, the function is named nl2br, not html2text.

Absolutely every example from the documentation http://php.net/manual/en/function.nl2br.php uses it exactly in this manner: taking the output and immediately outputting it to the resultant HTML document. I've already described why this is unsafe (take any of these examples, replace the string with something like "Everyone knows 4 < 5", and it breaks the document due to the inclusion of "special" characters).

Now you feel that the correct use of this function is so obvious that it merits mocking my belief that it may be misunderstood by users (despite the comments on that very documentation page describing how they use it as a simple text to html converter). So given that it is so obvious to you, I repeat my original request: just give me an example where nl2br isn't ultimately used to transform plaintext before outputting it to HTML.


Have you actually read the PHP documentation page for nl2br? People are absolutely using it that way and not making any remark about safety or security. As an anecdote, when I started using PHP, I began to use nl2br to change the newlines in my HTML to <br> tags and output them. You can chalk that up to me being a bad developer, I guess, but I literally got the idea from the PHP manual.

Also, your criticism of map() is kind of childish. It doesn't imply to a new learner that they will draw a map, nor does the documentation even hint at anything like that. In the Python documentation, they are given a clear use case and, if they are familiar with programming (or linguistics), understand that usage of the word map as a verb. Don't be obtuse about PHP's bad documentation.


Which is fine, as long as you make sure that your text doesn't contain any characters like "&", "<" or ">".


It's named quite well and explains exactly what it does. If you think this converts text to html you haven't read the manual. RTFM before programming or get out of the fucking field.


Why do trivial functions have to be added to the language core?


Why not? Most languages have a standard library that does all kinds of simple but useful stuff. PHP is aimed at the web so a function that deals with part of the web/non-web mismatch seems like a useful thing to include.

Removing it would simply cause newbies to have to wrangle with str_replace to build trivial web apps.


Here's my problem with nl2br:

Prior to 4.0.5, it used "<br>". As of 4.0.5, they switched to "<br />". (As of 5.3.0, they did the obvious thing and added a second parameter, is_xhtml).

This isn't an isolated incident -- any minor update is liable to change how a function works or what parameters it can take. So you're better off writing it yourself (or doing it inline with a string replace or regular expression).


4.0.5 was released in 2001, more than a decade ago. When complaining about a language, complain about a current feature breakage, not an old one that has since been fixed (5.3.0 came out in 2009, 3 years ago).


So, where's the problem? That in some godforsaken version, dead for many years and used by nobody, it worked differently? It's not a problem, it's a feature - the language evolves and changes according to what people need. It happens with all languages and all libraries and all code.


4.0.5 is a minor version. Minor versions shouldn't change the behavior of existing code (except to fix bugs).


4.0.5 was more than 11 years ago. 5.0 was released 8.5 years ago. Why, let's discuss something that happened 11 years ago as if it is relevant to anything now.


Actually it can be quite a useful function. In your __unsubstantiated__ opinion it's dumb.

EDIT: You want to talk about dumb. An incomplete object model that has no concept of protected members and doesn't enforce encapsulation on "private" ones. This is worse than PHP4 and pales in comparison with PHP5's object model.


What's wrong with not enforcing privacy? I have not seen privacy be abused writ large in Python programs, except in cases that are very practical (mostly finicky forms of testing that hit a few critical code paths).

I have fairly recently written code in both Python and Ruby, and of the many qualitative differences in how these cultures and languages influence projects written in them, it has never occurred to me that 'real' privacy in Ruby vs. convention-based privacy in Python was the cause of any noticeable difference.


Nothing wrong with it. Like nothing wrong with having nl2br function. Some don't need private class members. Some don't need nl2br. The wrong is when people think whoever doesn't need exactly what they need is dumb.


True, but why bake it into the language if it's not enforced? Most OOP requires encapsulation which requires private members and the ability to enforce this privacy. It should not be up to the user to enforce it, as that will never happen.


What's wrong with it? It's a bug by design that allows a useless feature (access to private members ... why not make everything public then?) and allows the destruction of a key concept of OOP: encapsulation.

Which is perfectly fine if you don't need encapsulation.


It seems your question is "why is there a 'private' variable convention, for something that isn't enforced by the language?"

The shortest answer is that it changes how people read the program and treat symbols branded with the underscore.

Empirically this has not caused rampant abuse, or even unintentional misuse, of otherwise internal members. So, without more evidence that it's causing a problem now, I think this design experiment -- ill advised or otherwise -- has been tried and seems quite successful. Hence, an appeal to philosophy is not at odds with the implementation. The simple rule is "don't do that," coupled with "and it should be obvious when you are." Just as you probably shouldn't break into another class via reflection to use its symbols, as seen in Java or .NET, or use .send in Ruby, but still can. Python opted -- mostly for reasons of implementation complexity reduction -- to just do nothing at all.

I think this viewpoint changes quite a bit in a language that is amenable to being statically analyzed, though.
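
To make the convention concrete, a small sketch (the `Account` class and its attributes are made up for illustration). A single leading underscore is purely a signal to the reader; a double leading underscore gets name-mangled by the compiler, which avoids accidental clashes in subclasses but is still not access control:

```python
class Account:
    def __init__(self, balance):
        self._balance = balance   # single underscore: "internal, hands off"
        self.__audit_log = []     # double underscore: stored as _Account__audit_log

    def deposit(self, amount):
        self._balance += amount
        self.__audit_log.append(amount)

acct = Account(100)
acct.deposit(50)

# Nothing stops outside code from reading either attribute.
print(acct._balance)
print(acct._Account__audit_log)
```

Reaching in like this works, but the underscore makes it obvious at the call site that you are doing so.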


Thought for a second that somebody had invented a Python library with functions that work identically with their PHP counterparts. Like php.js [1].

Turns out it's much more than that. Rather than handing out ready-made functions to make Python behave like PHP, this site actually teaches you how to write Python like a real Pythonista. For example, str[2:5] instead of substr(str, 2, 3). Well done!

[1] http://phpjs.org/
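
For anyone mapping between the two, a tiny sketch of how the slice relates to PHP's (start, length) arguments. The `substr` helper here is hypothetical and only covers non-negative arguments; PHP's handling of negative values is not reproduced:

```python
def substr(s, start, length=None):
    """Rough stand-in for PHP's substr(), for non-negative start/length only."""
    if length is None:
        return s[start:]
    return s[start:start + length]

text = "abcdefgh"
print(substr(text, 2, 3) == text[2:5])
```

The slice takes a start and an end index, so a length of 3 from index 2 becomes `[2:5]`.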


> Rather than handing out ready-made functions to make Python behave like PHP, this site actually teaches you how to write Python like a real Pythonista.

Not in my opinion. Here is my personal experience of using this site when I was just starting with Python. I had landed up there looking for a function similar to PHP's `array_fill_keys`. And it showed me `dict.fromkeys(keys, value)`[1] without the slightest warning that unexpected things will happen if it's passed a mutable data structure such as an empty list as the value.

So IMO, for people coming from PHP background writing some of their first Python code, it can be helpful at times but not an alternative to learning Python from a book or some serious online resource. Moreover, the site itself says that it won't teach you either PHP or Python.[2]

[1]: http://www.php2python.com/wiki/function.array-fill-keys/ [2]: http://www.php2python.com/about/


Not to mention the idiomatic way might just be one of:

    # before generator expressions (generates intermediate list)
    result = dict([ (k, value) for k in keys ])
    
    # with generator expressions (lazy iteration)
    result = dict((k, value) for k in keys)
    
    # with dict comprehension (brand new, probably the fastest)
    result = { k: value for k in keys }


    d = dict.fromkeys(keys, value)
is fine if value is immutable e.g., a string, number.

Your code doesn't solve the mutability problem: each value is the exact same object, so if you modify it for one key, the values are modified for all keys.

For a mutable type you need to create a new value for each key:

    d = {k: [] for k in keys}
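
A quick demonstration of the difference (the key names are arbitrary):

```python
keys = ["a", "b"]

shared = dict.fromkeys(keys, [])   # one list object, referenced by every key
shared["a"].append(1)

separate = {k: [] for k in keys}   # a fresh list per key
separate["a"].append(1)

print(shared)
print(separate)
```

With `fromkeys` the append shows up under both keys, because `shared["a"] is shared["b"]`; with the comprehension only "a" changes.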


This works fine too,

    dict((k, []) for k in keys)
Although I later realized what I really needed was defaultdict[1]

    from collections import defaultdict
    d = defaultdict(list)
[1]: http://docs.python.org/2/library/collections.html#collection...
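
A short sketch of how that tends to get used, with made-up data. There is no need to know the keys up front; each missing key gets its own new list the first time it is accessed:

```python
from collections import defaultdict

groups = defaultdict(list)   # missing keys get a new empty list on first access
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)

print(dict(groups))
```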


> Your code doesn't solve the mutability problem

I know that, but I willfully replicated the original code behavior and made it idiomatic (which has the advantage of making it both obvious and easily adjustable)


Cool job right there, but just curious: why is this PHP >> Python and not PHP << >> Python? Does that need a lot more work?


Because mostly everyone hates PHP. This site serves to show the Python way of doing things if anyone's interested, or is moving away from PHP.

Edit: I am wrong, not everyone hates PHP.


"Everyone hates PHP"? Really?

It might not be the most popular language around the HN crowd, but this forum is subject to trends and "emerging technologies" - not necessarily to say that these technologies are better or worse, just that they are the flavor of the month.

It's hard to argue with the numbers, PHP is still the most popularly used language [1] and while the gap is being bridged, it is really minuscule in comparison. I'm sure that while you may hate PHP, not every single developer does.

Yes, yes, down vote me. I'm a proud PHP developer, and to even utter such words on this site seems to result in the most powerful of criticisms. But underlying this entire argument is that behind every person who writes PHP is a Ruby coder, tutting and shaking his head; stood behind him is a Node.js developer, laughing at him; behind him is somebody writing Assembly, scowling into the distance in disgust; behind him is a C coder, eyes wide with surprise that someone is working in Assembly, etc.

I give it a few years until people are crying out loud on here that Node.js is dead and superlanguageemulator.pm or whatever comes next is the only way to code, at least, for people wise and intelligent enough to embrace this new technology.

[1] http://w3techs.com/technologies/overview/programming_languag...


That link isn't really credible at all. There are two obvious and major omissions that render their chart meaningless.

They don't say how they determine whether a technology is being used server side -- it could be only if they see ".php" URLs on the site, for example, which would mean there would be a strong bias against languages that make it easy to write sane URLs.

They also don't share what % of sites they're able to make a determination for at all, so these numbers could be based on a statistically insignificant sample.


You only have to go on the job market to see what languages are the most popular: PHP, Java and the various .NET languages (mostly VB and C#).

JS is on the rise, but no one hires anyone to write just JS. It's usually JS and PHP (and of course, HTML, CSS).

And another point, outside the bubble (I mean a place like Florida) it's hella easier finding a decent PHP guy than Python (or Ruby) guy. I learned that the hard way.


From what I can see, in my area (north England) the job market is almost entirely dominated by PHP and .net openings, with some Java on the side. Ruby and Python only very rarely get a look in, I imagine it would actually be difficult to find work in this area if you only knew Ruby or Python.

The job market in CA is very different to pretty much everywhere else in that emerging and niche technologies tend to be under-represented outside of CA, most likely because of budget constraints in web development, most clients want a simple Joomla or WordPress site and couldn't care less about anything else as long as they don't have to pay over £2,000 for it.


Re: the first para, definitely. I still get emails from recruiters (probably left something up on an old CV site I've since forgotten about), and every single one up north involves PHP and MySQL. The more imaginative roles include a bit of jQuery.

I've never received offers for anything else.


>> but no one hires anyone to write just JS.

You are joking. Right?

There are many good job opportunities for JS developers. I don't know where you got the idea otherwise.

FWIW, I was hired to do just JS.


I got the idea from hiring. Look at what was being advertised for; what I was competing against. But I live in Florida. What area of the country are you in?


By default PHP includes an X-Powered-By header in the response containing the PHP version used (should be turned off in production but is often not). This is a more common thing to track.


> which would mean there would be a strong bias against languages that make it easy to write sane URLs

This is false. It is very easy to make sane URLs in PHP. You just designate index.php (or whatever you want, actually) as the default index in your http server. In fact, that's how it works in every language?


Did you just compare node.js to assembly? :)


Clearly not, it was a satire of our industry. I was comparing the attitudes of people, the languages they use were merely for the purpose of the narrative, it could quite easily have been Node.js and Perl or Python and Haskell


As someone who doesn't know much in the way of Python, it seems odd to me that many of the examples are blocks of code vs one line replacements for PHP functions. Is Python just more verbose?


I know both languages well. Python is much more succinct than PHP because of array literals, list comprehensions, lambdas and better native data types with ways to navigate them (all the functional methods are there). Only PHP 5.4 with a literal array syntax is beginning to approach some of the things you can do in Python.

You think PHP is short because you have functions like xmlrpc_encode_request and the equivalent in Python is 20 lines. What you don't see is that xmlrpc_encode_request is pages and pages of C code and PHP macros behind the scenes. That function should be in a library, not in a programming language, which is why PHP is more comparable to other web frameworks, not to other programming languages.


It IS a PHP library - just one written in C rather than in user-land PHP code. Why is this a problem? The XML-RPC extension isn't even enabled by default.


xmlrpc_encode_request on this site should probably be fixed to use xmlrpclib.dumps (or loads if you need the other direction). This has existed in Python since version 2.2.

I realize this is beside the point, but I'm willing to fix it myself.
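
Roughly what I have in mind, as a sketch rather than a drop-in replacement for the PHP signature. This uses the Python 3 module name, `xmlrpc.client` (the renamed `xmlrpclib`); the method name and arguments are made up:

```python
from xmlrpc.client import dumps, loads  # Python 3 name for Python 2's xmlrpclib

# params must be a tuple; the method name and arguments here are made up.
request_xml = dumps(("foo", 42), methodname="example.method")
print(request_xml)

# loads() goes the other direction and returns (params, methodname).
params, method = loads(request_xml)
```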


There is xmlrpclib included by default and it does all the XML stuff, so I think you can easily get away with something far more readable than what is shown in the Python reimplementation of the PHP function. http://docs.python.org/2/library/xmlrpclib.html#example-of-c...

It just doesn't have xmlrpc_encode_request() with the exact method signature.

P.S.: I am wondering if it is normal to have functions marked as EXPERIMENTAL with a biiiig warning sign for 11 years in the language? That really scares me. This function was introduced in PHP 4.1.0, which was released on 10-Dec-2001... That's just...

edit: removed rant about xmlrpc_encode_request being built-in. It's a non-default extension, apparently..


No it's not.

The reason the examples appear longer is because the website is documenting how to implement PHP functions in Python. Since Python is not PHP, it won't implement line-for-line replacement functions.


No, it's just that PHP has a tradition of implementing frequently used code in C. This means PHP has tons of functions that serve specific use cases, while Python has fewer of them, at least in the standard package (of course you can have a module that does the same and then you'd have one-liners too). So you're not comparing the same thing here.


Using both PHP5 in the past and Python more recently, I find that most things in Python need far less code to write than PHP.

The examples here are kind of misleading in that respect, but if you start writing code that takes advantage of the features of Python, your line count should be lower than or equal to PHP's.


I don't think it's complete. For example this: http://www.php2python.com/wiki/function.array-walk/ could be replaced with map() in Python.
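
A sketch of what I mean, assuming array_walk is being used to apply a callback to each element (the data is made up, and this is not what the linked page shows). Plain iteration covers the side-effect case; `map()` or a comprehension covers the transform case:

```python
prices = {"apple": 1.0, "pear": 2.5}

# Side effects per element, roughly what array_walk is often used for:
for name, price in prices.items():
    print(name, price)

# Transforming every value: map() or, more commonly, a comprehension.
doubled = list(map(lambda p: p * 2, prices.values()))
doubled_by_name = {name: price * 2 for name, price in prices.items()}
```

One difference to keep in mind: `map()` builds new values instead of modifying the original container in place.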


No, there are a lot of shortcuts and beautiful one-liners in Python. However, Pythonic code always focuses on being the simplest and cleanest as opposed to being the shortest. I suspect it's the author's way of imposing some good python coding standards at the same time.


I wish everyone hated PHP. Then no one would write it.


I agree with the "hate PHP and love Python" part. However, we may still need to use PHP, or at least read PHP code, at work/school, and this app could be useful for that purpose if it had Python to PHP conversion too.


I have been using Python for 6 months after working with PHP for many years. Guess what? I don't hate PHP.


I don't hate PHP, so you are wrong.


I actually found this quite useful today. A rare case where I needed to translate the Wordpress post decoding function to get the rendered HTML form from a post into Python. Soon pelican's importer will be better able to preserve content from Wordpress blogs.


I was pleasantly surprised to see this wasn't a link to php.py, the somewhat bizarre port of the PHP standard library to python:

http://code.google.com/p/php-py/source/browse/trunk/src/php/...


There's also PyHP to use PHP templates with Python: https://github.com/bendemott/pyhp


Just like with PHPJS (an effort to bring PHP standard library to JS) there are missing implementations:

http://www.php2python.com/wiki/function.mt-rand/


I suffer from the 'does everything in PHP because it's comfortable' habit, and this guide always helps when translating.


Very nice and useful.


PHP may have its flaws, but Python does not seem to be the answer either. By the way, have they already found a fail-safe way to visually represent the difference between a space and a tab? Since space is a keyword and a tab too, how does the programming with invisible differences in whitespace combinations work? Does it already support all possible unicode whitespace? I guess I should ask a high priest in Python whitespace calculus ... Why not simplify the whole thing and use brainfuck characters instead?


No one who actually uses Python has any day to day problems with white space.

Actually had one myself yesterday. Took me seconds to fix and was notable only because it was such a rare event.


I still have whitespace problems occasionally, though I know they're my fault and come from my bad setup.

Basically, my windows install of Vim and my *nix install are not using the same whitespace settings, and I sometimes have to fix a bunch of lines if I wrote them first in windows.

It happens infrequently enough that I haven't bothered to figure out exactly what the problem is, but I've been coding python for years without drilling this one down.


I do use python for work from time to time (not every day, but probably every week) and I still occasionally have problems with whitespace. It's one of the most annoying things with Python, especially if editing somebody else's code or working in foreign environment (another machine, etc.) where tools aren't configured in your standard way.


PEP8: "Use 4 spaces per indentation level."

If you're getting code from someone else in your company which doesn't, just hit them round the head with PEP8 until they learn.


It's easy to write a PEP, it's less easy to configure every editor in existence to use exactly 4 spaces, especially when normal humans prefer hitting tab once instead of hitting space 4 times.


Yeah, they found a good way of dealing with whitespace - it's generally done by not using notepad for editing source code and/or learning about 'expand' command in *nixes.


This must be the weirdest complaint about Python I've heard so far.


Generally one refrains from mixing tabs and spaces within the same file.
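
For what it's worth, Python 3 refuses to guess when the two are mixed inconsistently. A small sketch that checks a source string by compiling it (the helper name is made up):

```python
# Line 2 is indented with a tab, line 3 with eight spaces.
source = "if True:\n\tx = 1\n        y = 2\n"

def mixes_tabs_and_spaces(src):
    """Return True if Python 3 rejects src for inconsistent tab/space indentation."""
    try:
        compile(src, "<example>", "exec")
    except TabError:
        return True
    return False

print(mixes_tabs_and_spaces(source))
```

So the "invisible difference" becomes a loud error at compile time instead of silently changing what the program does.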



