Performance regression in shlex.quote from 3.13 to 3.14 #146385

@bonzini

Description

Bug report

Bug description:

#132036 changed the algorithm used by shlex.quote, and the change made it slower when the input has to be quoted. The 3.13 implementation used a regular expression search, which could short-circuit at the first unsafe character; the new code does not.

However, the isascii check is worthwhile and should be kept: see the non-ASCII case in the benchmark below.
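A minimal sketch of the combination suggested here, i.e. the 3.13 regular expression search guarded by an isascii check (the same condition as the `# BEST` comment in the script below). The name `combined_quote` is hypothetical and only used for illustration; this is not the actual shlex code and not a proposed patch.

```python
import re

# Same pattern as the 3.13 implementation: any character outside the safe set.
_find_unsafe = re.compile(r'[^\w@%+=:,./-]', re.ASCII).search


def combined_quote(s):  # hypothetical name, not part of shlex
    """Return a shell-escaped version of the string *s*."""
    if not s:
        return "''"
    # isascii() rejects non-ASCII input cheaply; for ASCII input the regex
    # search stops at the first unsafe character it finds.
    if s.isascii() and _find_unsafe(s) is None:
        return s

    # use single quotes, and put single quotes into double quotes
    # the string $'b is then quoted as '$'"'"'b'
    return "'" + s.replace("'", "'\"'\"'") + "'"
```

It can be added to the `g` dictionary in the benchmark to time it next to `old_quote` and `new_quote`.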

Cc @picnixz

import re
import shlex
import timeit

# From 3.13
_find_unsafe = re.compile(r'[^\w@%+=:,./-]', re.ASCII).search
def old_quote(s):
    """Return a shell-escaped version of the string *s*."""
    if not s:
        return "''"
    # BEST: if s.isascii() and _find_unsafe(s) is None:
    if _find_unsafe(s) is None:
        return s

    # use single quotes, and put single quotes into double quotes
    # the string $'b is then quoted as '$'"'"'b'
    return "'" + s.replace("'", "'\"'\"'") + "'"


g = {'old_quote': old_quote, 'new_quote': shlex.quote}
print('with spaces')
print('  old', timeit.timeit("old_quote('the quick brown fox jumps over the lazy dog')", globals=g, number=1000000))
print('  new', timeit.timeit("new_quote('the quick brown fox jumps over the lazy dog')", globals=g, number=1000000))
print('without spaces')
print('  old', timeit.timeit("old_quote('thequickbrownfoxjumpsoverthelazydog')", globals=g, number=1000000))
print('  new', timeit.timeit("new_quote('thequickbrownfoxjumpsoverthelazydog')", globals=g, number=1000000))
print('non-ASCII')
print('  old', timeit.timeit("old_quote('mötley')", globals=g, number=1000000))
print('  new', timeit.timeit("new_quote('mötley')", globals=g, number=1000000))
print('short')
print('  old', timeit.timeit("old_quote('a')", globals=g, number=1000000))
print('  new', timeit.timeit("new_quote('a')", globals=g, number=1000000))

Sample output:

with spaces
  old 0.4148377259989502
  new 0.5036935329990229
without spaces
  old 0.3872929839999415
  new 0.3540855330065824
non-ASCII
  old 0.4636239370011026
  new 0.20726546400692314
short
  old 0.1217202929983614
  new 0.2977778149943333
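
The timings above are easier to compare as new/old ratios, where a value above 1 means the 3.14 function is slower. A small sketch, with the numbers copied from the sample output; the names `timings` and `ratios` are only illustrative.

```python
# (old, new) timings in seconds per 1,000,000 calls, copied from the output above.
timings = {
    "with spaces": (0.4148377259989502, 0.5036935329990229),
    "without spaces": (0.3872929839999415, 0.3540855330065824),
    "'mötley'": (0.4636239370011026, 0.20726546400692314),
    "short": (0.1217202929983614, 0.2977778149943333),
}

# Ratio > 1: new is slower than old; ratio < 1: new is faster.
ratios = {name: new / old for name, (old, new) in timings.items()}

for name, ratio in ratios.items():
    print(f"{name:15s} new/old = {ratio:.2f}")
```

On these numbers the input that needs quoting is roughly 1.2x slower and the one-character input roughly 2.4x slower, while the 'mötley' input takes less than half the time.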

CPython versions tested on:

3.14

Operating systems tested on:

Linux

Metadata

Assignees

No one assigned

    Labels

    3.14: bugs and security fixes
    3.15: new features, bugs and security fixes
    performance: Performance or resource usage
    stdlib: Standard Library Python modules in the Lib/ directory
    type-bug: An unexpected behavior, bug, or error
