String interpolation is one of those features where I'm really confused about the direction the Python team wants to take. It's implicit and magic, and yet another non-obvious way to do string formatting.
What's more, it's being hailed as a plus for localization, which it isn't. Localizers should never, ever deal with string interpolation - anything past what .format() does is essentially untranslatable.
Why is it implicit and magic? Looking at this post, and the PEP, it seems like interpolation is actually pretty close to .format in semantics, but syntactically simpler. There are a few bits that I didn't quite follow in the PEP, so I may just be missing the issue, however.
The important thing for localization is to allow the strings to be swapped at runtime by loading them from some kind of database, in a way where the arguments are at least numbered, if not named, so they can appear in different orders for different languages. The interpolation syntax does not seem, to me, to allow for this: it is an "implicit" syntax applied during parsing to take a constant string and immediately build it out of parts, as opposed to the "explicit" % operator or .format methods; with those two models it is extremely clear how one would load a template that has the strings in a different order, or which uses a different subset of the strings.
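A rough sketch of what I mean (the German string is just illustrative): the named arguments appear in a different order per language, and the template itself is swapped at runtime, which a literal-only f-string cannot express.

templates = {
    'en': '{count} new messages for {user}',
    'de': '{user} hat {count} neue Nachrichten',  # illustrative translation
}
print(templates['de'].format(user='tim', count=3))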
The PEP that is looking at this problem has been deferred: at the end of the document they kind of indicate that they didn't really think about the i18n problem correctly and so don't actually have a good solution to present, and are going back to think about it more.
If you need to localize, you still have format(). You just happen to have a shorthand for the 80% case, just like you have @decorator as a shortcut for decorated = decorator(decorated), or [x for x in z] as an alternative to a classic loop.
Yes, having so many solutions is not ideal, especially when it comes to teaching the language. However, it would be foolish to avoid improving the usability of Python just to avoid having "one more way to do it".
I do wish they'd deprecate Template though. It's more than useless.
I suppose they need to add an API so that f-strings can be automatically presented to gettext or another translation database layer before substitution is performed.
Being a die-hard daily Python user, I concur. I think the core devs should not keep inventing new syntax and new ways to solve problems that don't need to be added to the language.
I am not sure why i18n is a big deal, let another library deal with it.
Whatever, I have the choice of not using this feature after it is accepted and implemented. Their PEP discussions on the mailing list always ended up going off on tangents. The problem I always have with string manipulation is dealing with long strings, which for coding style I'd split across multiple concatenated pieces, and using format there gets pretty ugly.
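For example, this is the kind of thing I mean (a sketch; the query and names are made up):

query = ('SELECT name, email '
         'FROM users '
         'WHERE id = {user_id}')
print(query.format(user_id=42))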
I agree with you. It is already possible to use quotes and plus signs in much the same way as the braces in the new syntax, with the added advantage that the original syntax is more orthogonal.
Is it really worth updating all the Python syntax formatting and analysis code out there just to save one character on an operator? I don't think it's a good tradeoff.
> (Here we can fix up the silly US date formatting, too, although really you shouldn't be localising dates in your format strings).
ISO 8601 (similar to Japanese format) is the most logical way to format dates: YYYY-MM-DD, easily sortable, not the silly way US and Europeans format their dates. ;)
Date formatting preferences are done at the desktop level and apps should query for it. All frameworks I can think of have a way of doing so, and you can just tell them "I want a long-form date & time", "I want a short date", "I want a short non-numeric date".
If you are a fan of ISO8601 (which I am), you can then set this globally on your desktop, rather than expect the app developers / translators to choose for you.
Someone made a good argument that in order to support lots of languages, you actually need general functions, not just strings, to be put into .format() or interpolation.
That's because some languages have complicated changes in the text depending on eg the number of things: not just singular/plural, but more complicated. Russian is one example.
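For what it's worth, a hedged sketch of how that looks with gettext's ngettext (the 'myapp' domain and locale directory are made up; the actual Russian plural rules live in the .po catalog, not in the code):

import gettext

t = gettext.translation('myapp', localedir='locale', languages=['ru'], fallback=True)

def files_removed(n):
    # ngettext picks the catalog's plural form for n; with fallback=True and no
    # catalog installed it degrades to the English singular/plural rule.
    return t.ngettext('{n} file removed', '{n} files removed', n).format(n=n)

print(files_removed(1))
print(files_removed(5))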
In practice, what you need is a notion of a "context" for your formatting (often something that is aware of both the current i18n and l10n factors), and that context needs access to the fundamental objects that would be rendered in to a string.
Effectively, the "format string" becomes an identifier for the particular message you want to render, though often the objects themselves are definitive enough.
In that paradigm, interpolation/formatting/whatever might be the default mechanism employed by the context, but you don't want to have that as your explicit mechanism. You want one more level of indirection before you get to it.
Some platforms make the mistake of having that context be tied to the thread, or worse still, a global, and that falls apart once you have any multiplexing logic (and the way Python currently works with POSIX locales fully captures how terribly you can do this). Either way, I'd argue it really is a different problem from interpolation, and if you are trying to solve it using interpolation, you don't understand the problem.
Can you clarify the last point? I do not get it. Can you make an example of where an extra indirect mechanism (besides interpolation) is _required_ for translation?
I generally localize for western languages and write in many programming languages. The combination of gettext + python format strings has been working really great for me, and generally much better than other systems I've seen and put to use. In fact, the simplicity of gettext provides a very fast turn-around, and with translators experienced with the tool I never had problems. Python format strings also work great in this context, as I can supply an arbitrary dictionary of elements that the translator might need. The only real problem has been plural forms in complex text strings, where ngettext is not always sufficient.
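To be concrete, a minimal sketch of that workflow (the domain and values are made up):

import gettext

t = gettext.translation('myapp', localedir='locale', fallback=True)
_ = t.gettext

# The translator only ever sees the named placeholders; the code supplies a
# dictionary of pre-extracted values.
values = {'user': 'tim', 'dish': 'kung pao'}
print(_('{user} likes {dish}').format(**values))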
What method (name a project I can inspect) do you recommend as a good localization architecture?
'file_not_found' is just an identifier to lookup the actual format template (likely a format string) that will be combined with the file object to render the error message.
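Something along these lines, say (MESSAGES and render() are illustrative, not an existing API):

MESSAGES = {
    'file_not_found': 'The file {path} could not be found.',
}

def render(message_id, **context):
    # Look the identifier up in a (possibly per-locale) catalog, then combine
    # it with the supplied context to produce the final message.
    return MESSAGES[message_id].format(**context)

print(render('file_not_found', path='/tmp/missing.txt'))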
There's not much difference. In fact, you cannot expect translators to know how the underlying object is handled, or to write a data extractor for the file object.
If translators are cooperating with you, it's very easy to provide the needed elements directly in the format's dictionary (that is: you extract the translatable pieces for them). It also means you don't have to worry that they're going to fiddle with mutable state.
I generally write everything in English, and do back-translation to my own locale (I also cooperate in translating external projects into my locale), so I eat my own dogfood here.
I know I do not want to deal with extra lower-level subtleties here. Translation is hard enough by itself. It's impressive how a good translation of a simple UI can take so much time. If I had to inspect the object to know what I can get out of it, I would go crazy.
I'd take a pre-baked dictionary any time.
I've also already used the string-catalog approach in the past (heh, XUL), and I'd personally take gettext any day.
I do, yes. But the translator needs to be aware of the extra indirection to produce the text he needs, and between a custom layer and a standard formatting syntax, the second is definitely friendlier for anyone approaching translation, even when technically less powerful.
At some point the translator will have to format some string himself.
OK. How is that different from using functions? Conceptually, there's a function that does arbitrary computation under the hood.
(I think we are already agreeing, only that you express your point differently in Python specific terms. I don't care whether you stuff your code into a context object or something else. I was talking about having the full power of the programming language available, vs using a limited language like format strings.)
I'm a bit confused about all the doubt on such a common feature industry wide. No, it's not magic, it's a simple transformation, takes about 30 seconds to learn, and is becoming an industry standard among modern languages. See C#, Scala, JS, Swift, etc, not to mention bash and perl.
It also has little in common with i18n, the use cases differ too much. Perhaps in the future someone can figure out how to bring them together, but not today.
Well, Rust formatting is magical, unlike that of Go: it has to be because it has type-safe format strings, which can't be expressed in the normal language. But the magic is encapsulated into a macro, so the complexity doesn't leak into the language per se. Someone could write a similar string interpolation macro and replicate Python's feature without modifying the compiler if they wanted--though the fact that nobody has so far indicates to me that it probably isn't needed, as the {} syntax is awfully lightweight already.
Python was and still is to some diminishing degree influenced by C, C++, and Java - none of which have language syntax for interpolating strings based on variables in scope.
I wouldn't say it was influenced by Java, seeing as all the core syntax (functions, classes, modules, etc) was already present when Java was first released publicly. The stuff added later doesn't come from it either (list comprehensions, generators, etc).
Sorry, that's about 1000% better. This should have been the one way to do it, originally. It isn't magic either, rather a simple compile-time transformation to existing format syntax. There's nothing new to remember, just a large reduction in noise.
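Conceptually it boils down to this (a sketch; the compiler emits dedicated bytecode rather than literally calling str.format):

a, b = 'spam', 'ham'
assert f'{a} {b}' == '{} {}'.format(a, b)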
I'm in the "wish it was more explicit" camp. Would a fmt function be that terrible?
fmt("{a} {b} {c}", a=a, b=b, c=c)
If you want to save typing, maybe use :a instead of {a}. Or ?a would have made plain old ? a nice positional variant:
fmt("?a ? ?", a=a, b, c)
The main benefit of the fmt function is that it requires no syntax changes to the language and is trivially provided by a third party library for all past versions of Python.
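Something like this would do, say (fmt here is hypothetical, just a thin wrapper over str.format):

def fmt(template, *args, **kwargs):
    return template.format(*args, **kwargs)

a, b, c = 1, 2, 3
print(fmt("{a} {b} {c}", a=a, b=b, c=c))  # 1 2 3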
That being said this ship has sailed. I guess I just take a more conservative approach to syntax changes than most.
Update: a bit sad to see my votes fluctuating wildly on this post. Please don't use votes to support or disagree with me: that's not what they're for. Please vote only based on whether you find this relevant.
There's a fantastic "more explicit" version of it: str.format. This really is just an implicit version of str.format(). I just don't understand why such a significant syntax change would make it into python when it deviates so much from their usual mantra.
Guido is not a purity robot, and has been looking for a way to have this feature since perl and python were duking it out in the early nineties. Practicality beats purity.
It isn't a large syntax change, string prefixes have existed since the beginning.
It is explicit, in that one must use the f'' prefix. There are no syntax changes as u'', b'', and r'' already exist. Positionals make bugs more likely.
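A small illustration of the positional-swap risk:

first, last = 'Ada', 'Lovelace'
print('{} {}'.format(last, first))   # silently wrong: "Lovelace Ada"
print(f'{first} {last}')             # the mistake is visible at a glance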
It is implicit in that the interpolated values are specified anywhere in scope, not inline as with every other formatting option.
I consider specifying the values or variables alongside the formatting string a requirement to be considered explicit, but I can see how it's a matter of opinion.
The switch is implicit too. It's syntactic sugar to gain a practical and elegant syntax for a common use case.
It's not going to introduce vulnerabilities.
It's going to make your code easier to write and read.
It's going to make bugs easier to spot in formatting.
It's going to make shell sessions easier.
The manipulation (@dec) is still right next to the thing being manipulated (def dec).
I'm not trying to be pedantic. It really does affect readability when syntactic sugar's effect spans an entire scope. Whether or not that effect on readability is greater or less than the gain from the syntactic sugar is always a matter of opinion. Obviously my opinion is out of line with Python's core devs.
> ... though similar to adding a function that didn't exist before.
This is not true and exactly the distinction I'm trying to make:
Adding new packages, functions, objects, etc. can all be backported to older versions and alternative implementations. They also require no updates to ASTs, linters, syntax highlighters, static analysis tools etc.
Adding new syntax is backward incompatible (unless it's gated behind a from __future__ import) and requires changes to all tools that parse Python syntax (the interpreter, ASTs, linters, transpilers, etc).
Ok, true, though I would argue that it isn't entirely new syntax but rather a variation of an existing one.
It is a shame that linters will have to add a letter to their grammar also, but I argue that the everyday usability and readability for millions will outweigh this drawback.
I suppose technically true, though similar to adding a function that didn't exist before. The syntax extension part of the feature (adding a letter to the grammar) is minuscule.
Some people deride string interpolation as not explicit enough, but I think it's extremely readable and adds clarity. Plus, with the `f` prefix, it's plenty explicit IMO.
String interpolation is only being added for string literals, not for generic strings. This means that you still wouldn't be able to read a template from a file, then format text in.
with open('template.txt') as f:
template = f.read()
formatted = template.format(**values)
Arbitrary string interpolation was proposed, in a way sure to result in security problems:
"PEP 498 proposes new syntactic support for string interpolation that is transparent to the compiler, allow name references from the interpolation operation full access to containing namespaces (as with any other expression), rather than being limited to explicit name references. These are referred to in the PEP as "f-strings" (a mnemonic for "formatted strings")."
"Full access to containing namespaces?" From strings? Bad, bad idea. This is currently marked as "deferred", but should be marked "rejected with extreme prejudice".
Plenty of things can go wrong, since you have control over how many and which variables to interpolate, e.g.:
i"SELECT {settings.SECRET_KEY};"
PS: (I don't know why but HN is not updating the page with the reply links needed, so I'll just edit this)
Yes, I agree. I think that the misunderstanding happened when mixmastamyk wrote
> That's a good thing due to security reasons as arbitrary expressions are allowed. There are plenty of templating solutions available
The point is that even if you don't allow arbitrary expressions (which imho are a mistake, and of which I haven't seen a single use case yet), having this kind of interpolation from strings that are not literals (i.e. are not in the trusted source code) would still be a security issue
Since the PEPs apparently don't propose to extend this to non-literals, we're safe. But it's better to be wary and attentively review such proposals...
In fact, I just realized right now that Animats might have misunderstood PEP 501, since
sql(i"SELECT {column} FROM {table};")
should be perfectly safe from sqli vulns
PPS: Unless Animats is pointing out how switching i'' for f'' is a terribly simple mistake to do and hard to spot during a code review... I agree with that
I agree. I was pointing out that the f'{a} {b}' syntax can't accomplish everything. Since it isn't intended as a complete replacement, the way `.format` is intended as a complete replacement for `%`, it will just add another string formatting technique without ever deprecating the old way.
The motivation for PEP-0498 given in the article was the difference in verbosity between these two lines:
"{} {}".format(a, b)
"%s %s" % (a, b,)
That's not very convincing. I was hoping this article would make a good case for interpolated strings, since it's starting to feel like Python is having an identity crisis. Type annotations especially took me by surprise, but string interpolation is another good example of an addition that doesn't feel like Python (imho, anyway).
>>> # Explicit but tedious and doesn't help readability:
>>> print('{very_long_var_name_1}: {very_long_var_name_2}'.format(
... very_long_var_name_1=very_long_var_name_1,
... very_long_var_name_2=very_long_var_name_2))
spam: ham
with
>>> # Explicit but somehow feels dirty:
>>> print('{very_long_var_name_1}: {very_long_var_name_2}'.format(
... **locals()))
spam: ham
and
>>> # Still fits on one line. I think f prefix makes intent clear.
>>> print(f'{very_long_var_name_1}: {very_long_var_name_2}')
spam: ham
Thing is, **locals() "feels dirty" but does exactly the same thing as f''. Except f'' does it implicitly, hiding dirt under the carpet. One day, that dirt will turn something into a bug; which is exactly why the Zen says "explicit is better than implicit".
The best way imho would be:
vars = {'short1': very_long_var_name1,
        'short2': very_long_var_name2}
print('{short1} {short2}'.format(**vars))
Easy and extremely unlikely to ever include the wrong variable.
>>> from string import Template
>>> s = Template('$who likes $what')
>>> s.substitute(who='tim', what='kung pao')
'tim likes kung pao'
This will probably be TIL for many people. It is a surprisingly hidden feature.
I, for one, still like "%s" % x instead of "{0}".format(x). It is simply shorter and I already know or use printf for C in other parts of the project. But with the new f"..." interpolation, I can see liking that more. I am all for being as concise as possible while still being explicit (if that makes any sense at all ;-) ).
I've always found templates way overkill for most of my needs, but they cover a specific and legitimate use-case (large and complex blocks of text).
In the same way, .format() solved the problem of inconsistency and bugs happening all the time with %. It was a restriction and formalization effort, trying to root out bad practices and hence slightly more explicit, but it made sense.
The new f'' IMHO does not solve anything beyond pandering to developer laziness. It will likely introduce bugs in places where developers are not clear about the context ("oh, I thought we didn't have a 'x' var at this point, turns out we do!"), because (from what I understand) it takes away the ability to define which variables should be considered.
Luckily it will take a while before it percolates in any significant library, but I sincerely hope it just doesn't gain much traction.
I don't know why Python re match objects don't support indexing; a __getitem__ method would allow m[1] and m[2] instead (or m['name'] for named groups).
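A hedged sketch of the kind of thing I mean (IndexableMatch is made up; the stdlib match object only offers .group()):

import re

class IndexableMatch:
    def __init__(self, match):
        self._match = match
    def __getitem__(self, key):
        # Delegate both numeric and named lookups to group().
        return self._match.group(key)

m = IndexableMatch(re.match(r'(?P<name>\w+) (\w+)', 'kung pao'))
print(m[1], m[2], m['name'])  # kung pao kung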
That aside, though, I don't consider brevity alone the most critical criterion for a programming language; that way lies APL. Expressiveness, yes, but not at the expense of clarity.
At the very least, it sure looks a hell of a lot better. And using symbols and single-character abbreviations for types (in a dynamically typed language, mind you), and an overloaded % with the extra syntactical rule that it only takes a single argument, so that multiple replacements have to be done by throwing them all into a tuple, is FAR more intuitive and readable than the way most languages do string formatting. Plus, alternatives that require you to bounce back and forth between the string and the variable it is replacing, rather than reading left to right the way humans are meant to read text, are a terrible design. Something as simple as string formatting should not rely on arcane and nuanced rules which are more or less arbitrary.
You've all been programming in C-like languages for far too long to realize what a horrible design string formatting is. You can argue over "explicitness" all you want; the new way is easier to learn, easier to read, makes more intuitive sense, requires learning fewer rules, and is close enough to the format string method that they work well together.
As an outsider to the python community and full time ruby dev, this "controversy" baffles me every time it comes up. Lightweight string interpolation is obviously better! I think you all have string Stockholm Syndrome or something.
The one counter argument that makes sense to me is that in general we shouldn't be doing easy string interpolation, since that way lies SQL injection, XSS, etc, and should instead rely on a stronger type system with binary text blobs, HtmlStrings, SqlStrings, etc, with automatic escaping into and out of the data type.
But then that's not the case with Python now. If you're only trying to stick this string inside that string in a quick and dirty manner, I totally don't understand the reticence folks have to something the way ruby does it: "Name: #{first_name}".
If it's obviously better, would you care to share some better arguments than "people who disagree have stockholm syndrome"?
Don't get me wrong, I'd like Python to have better, more obvious, more concise string formatting. However the last time we had this discussion, it was about str.format() and how it was going to be awesome and don't worry modulo-formatting will go away.
Turns out it did not; modulo formatting is still there because why would it be removed. This is history repeating itself - are you actually baffled that some people learn from past mistakes?
"Try it!" is not an objective argument (I have tried it extensively eg. in Bash and I neither like or hate it), and positional mismatch is not a concern when you use name-based syntax, which interpolation forces you into due to its nature.
I've made numerous posts in this discussion, so not going to copy them again here. The previous named-based syntax comes with significant redundancy and noise.
As a ruby dev posted here, it is obviously better in most respects in most common cases.
Not a fan. We already have two ways of formatting strings in Python; we really should not bring in a third one.
It may be true that
"{} {}".format(a, b)
is a bit verbose, but it is crystal clear and clean. Just remember the Python Zen: "There should be one-- and preferably only one --obvious way to do it." _and_ "Explicit is better than implicit."
If they're adding features purely for the convenience of Python shell users, how about, I don't know, removing the need for significant whitespace (by adding an "end block" statement)? Because the Python shell is a major pain to use when you try to copy-paste some code into it and it doesn't like the indentation levels. On the other hand I can copy-paste whatever into my Lisp REPL and it will run just fine. Or allow several statements per lambda. Saving a few keystrokes writing ".format" doesn't even register on the same scale of annoyance.
Approximately 2.5 times since then has whitespace been a problem, and I fixed it in under 10 seconds each time.
Yet the readability gains from the removal of block delimiters in that time frame are uncountable.
> Saving a few keystrokes writing ".format" doesn't even register on the same scale of annoyance
Am I crazy or has PHP had this feature since forever? String templates can significantly improve programming speed, especially for those of us who are keyboard-impaired. Not having to remember how many %s you need, or whether one of them should be %d and so on, may seem like nitpicking to most, but personally it makes a night-and-day difference.
PHP has had it forever. Given PHP's reputation, I'm not sure if that is a good argument for this feature. Rather than '%s' and '%d', why not just use '{}' and '.format'? That way, each argument is converted to a string appropriately. If it becomes confusing with many arguments, it is easy to name the parameters '{myparam}'
If you are in a context where you are concerned about "privileged vs. unprivileged" code, you have to sandbox way more than that, and Lua provides you many mechanisms to control that (more than any other mainstream scripting language). Disabling the "debug" library which I used there for traversing block scopes is the very first thing one does when sandboxing in Lua.
Though I think the communication has been pretty clear that developers should stop using % formatting in favor of .format when moving to Python 3. Not so much a deviation, just a migration. Now that there's another alternative, there definitely needs to be some clear communication about what's going to be idiomatic going forward.
Except that they went and made .format() useless for bytes... which made everyone have to hang on to % both for 2/3 compatibility and for all the cases where bytes templates were actually needed (all kinds of low-level wire protocols and file storage formats).
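For reference, a sketch of the bytes case (%-formatting on bytes only came back in Python 3.5 via PEP 461, and .format() never got a bytes counterpart):

header = b'Content-Length: %d\r\n' % (42,)
print(header)  # b'Content-Length: 42\r\n'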
The rift between Python 3 and Python 2 seems to be a fallout of the one-true-way philosophy. In fact, it almost stands to reason that Python 3.6 ought to be Python 4 if one-true-way needs to be upheld.
If Python 3.6 is going to introduce multiple ways to do the same thing, there is no good reason to not merge Python 2 and 3 together and have both set of behavior co-exist with each other (__future__ or __past__).
Apparently you haven't been using Python 3 much. It reduces the number of ways to do things a lot.
- class stuff(object) vs class stuff;
- range vs xrange;
- itertools.izip vs zip;
- itertools.imap vs map;
- itertools.ifilter vs filter;
- dict.items vs dict.iteritems vs dict.viewitems;
- dict.values vs dict.itervalues vs dict.viewvalues;
- dict.keys vs dict.iterkeys vs dict.viewkeys;
- __cmp__ vs __eq__ + __gt__;
- sorted(cmp) vs sorted(key);
They didn't fall out of the philosophy. They have been pragmatic and tried to balance the language design: gaining modern features vs making a robust base vs pleasing the legacy crowd. It. Is. Very. Hard.
And yes, we would all prefer to have fewer ways to format. Does that mean I would prefer NOT to have f-strings? Certainly not, it's a great feature. We can't live in the past just because moving forward means falling a bit short of our ideals.
Real life is not ideal.
But merging Python 2 and 3 ?
With the string model completely reworked, that would be apocalyptic. Most people don't realize how deep the unicode change has been.
I have been developing in and teaching Python 2 and 3 for years. The amount of problems linked to UnicodeDecodeError dropped by 90% after the switch.
Not because Python 2 model didn't work.
Because nobody understands text.
Most devs don't understand what text is. They just want to format strings. That's what Python 3 helps them do, and it does it well.
Mixing both would be like mixing olive oil and vanilla ice cream. Great in their own way, but use them together and all you'll get is a terrible meal.
At no point am I arguing whether Python 3 is better or worse than Python 2. That is an uninteresting conversation, because the fact remains that less than 30% of all software is in Python 3. Even when Google released Tensorflow, it was in Python 2 (and they employ core developers of Python).
Do you seriously think anybody is thinking of dropping Python 2 support by 2020? That will only create a fork. None of the core frameworks have upgraded in a decade. Look at Flask for example. So yes, I haven't been using Python 3 much - there has been no reason to.
The point I'm trying to make is how to get everyone on the same page. The reason Python 3 was API incompatible was the core tenet of one-true-way. All the functions you mentioned may be superior, but unless you give people a way to mix and match both in the same source, you will not have adoption.
Or do you think Python 3 adoption has been successful ?
"PEP-0498 tries to improve this situation by offering something that has been common to other languages like Ruby, Scala and Perl for quite some time: Interpolated strings."
P3 .format is fine; the only problem I have is forgetting the last ')', and vim picks this up. Is interpolation good enough to justify introducing another way of doing things?
I suppose you mean to take the variable names of the arguments without using keyword arguments? That's not possible because .format() is just a simple method call on str/unicode objects.
That statement is against my religious beliefs. Apparently we can stick an f in front of the string, but we can't have default values that work with locals()?
That would mean the formattee would have to dig through its surrounding namespace. This solution is more elegant: the string is transformed into the equivalent format call at compile time, making the implementation tiny.
Python provides many ways to 'dig through' the surrounding namespace. I find it ironic that doing so is effortless in the language while at the same time being considered unpythonic. Which is it? Is the language itself unpythonic?