author: DongHun Kwak <dh0128.kwak@samsung.com> 2018-08-22 15:55:10 +0900
committer: DongHun Kwak <dh0128.kwak@samsung.com> 2018-08-22 15:55:25 +0900
commit: 7628643e8630407914001b95e687a33e0a5715bb
tree: 5a392199e62b7932b55adf4d9b8138157389a8fb /Doc/library/re.rst
parent: 6f2d28a8c10b2a342d42c4f495ea8844cce77d74
Imported Upstream version 2.7.14 (ref: upstream/2.7.14)
Change-Id: Icfe8dc39f6e866f9cdf059cfd57789fed01f9469
Signed-off-by: DongHun Kwak <dh0128.kwak@samsung.com>
Diffstat (limited to 'Doc/library/re.rst')
-rw-r--r-- Doc/library/re.rst | 34
1 files changed, 27 insertions, 7 deletions
diff --git a/Doc/library/re.rst b/Doc/library/re.rst
index 7b76d0c..d0798b7 100644
--- a/Doc/library/re.rst
+++ b/Doc/library/re.rst
@@ -33,6 +33,12 @@ module-level functions and :class:`RegexObject` methods. The functions are
shortcuts that don't require you to compile a regex object first, but miss some
fine-tuning parameters.
+.. seealso::
+
+ The third-party `regex <https://pypi.python.org/pypi/regex/>`_ module,
+ which has an API compatible with the standard library :mod:`re` module,
+ but offers additional functionality and a more thorough Unicode support.
+
.. _re-syntax:
@@ -480,7 +486,9 @@ form.
IGNORECASE
Perform case-insensitive matching; expressions like ``[A-Z]`` will match
- lowercase letters, too. This is not affected by the current locale.
+ lowercase letters, too. This is not affected by the current locale. To
+ get this effect on non-ASCII Unicode characters such as ``ü`` and ``Ü``,
+ add the :const:`UNICODE` flag.
.. data:: L
@@ -511,8 +519,9 @@ form.
.. data:: U
UNICODE
- Make ``\w``, ``\W``, ``\b``, ``\B``, ``\d``, ``\D``, ``\s`` and ``\S`` dependent
- on the Unicode character properties database.
+ Make the ``\w``, ``\W``, ``\b``, ``\B``, ``\d``, ``\D``, ``\s`` and ``\S``
+ sequences dependent on the Unicode character properties database. Also
+ enables non-ASCII matching for :const:`IGNORECASE`.
.. versionadded:: 2.0
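
Reviewer note on the two hunks above: the interaction between IGNORECASE and UNICODE can be checked with a short sketch like the following. The variable names are illustrative only and are not part of the patch. The behaviour of the first call is version-dependent: under Python 2.7, as the patched text describes, it is expected to find no match, whereas Python 3 treats ``str`` patterns as Unicode by default.

```python
# -*- coding: utf-8 -*-
import re

# Illustrative names (not from the patch): u-umlaut in lower and upper case.
lower = u'\xfc'   # ü
upper = u'\xdc'   # Ü

# IGNORECASE alone: per the patched documentation, Python 2.7 does not
# fold non-ASCII characters here. Python 3 behaves differently because
# str patterns use Unicode matching by default.
ascii_only = re.search(lower, upper, re.IGNORECASE)

# Adding UNICODE enables case-insensitive matching for non-ASCII
# characters on Python 2.7; on Python 3 the flag is redundant but harmless.
with_unicode = re.search(lower, upper, re.IGNORECASE | re.UNICODE)
```

Only ``with_unicode`` is guaranteed to be a match object on both interpreter lines, which is why the patch recommends combining the two flags.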
@@ -689,11 +698,22 @@ form.
Added the optional flags argument.
-.. function:: escape(string)
+.. function:: escape(pattern)
+
+ Escape all the characters in *pattern* except ASCII letters and numbers.
+ This is useful if you want to match an arbitrary literal string that may
+ have regular expression metacharacters in it. For example::
+
+ >>> print re.escape('python.exe')
+ python\.exe
+
+ >>> legal_chars = string.ascii_lowercase + string.digits + "!#$%&'*+-.^_`|~:"
+ >>> print '[%s]+' % re.escape(legal_chars)
+ [abcdefghijklmnopqrstuvwxyz0123456789\!\#\$\%\&\'\*\+\-\.\^\_\`\|\~\:]+
- Return *string* with all non-alphanumerics backslashed; this is useful if you
- want to match an arbitrary literal string that may have regular expression
- metacharacters in it.
+ >>> operators = ['+', '-', '*', '/', '**']
+ >>> print '|'.join(map(re.escape, sorted(operators, reverse=True)))
+ \/|\-|\+|\*\*|\*
.. function:: purge()
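
Reviewer note on the ``escape`` hunk: the third doctest sorts the operators in reverse before joining them, which matters because regex alternation tries branches left to right. The sketch below, using the same hypothetical operator list as the patch, shows the escaped alternation being used for tokenising. It deliberately avoids printing the escaped pattern, since the exact set of characters that ``re.escape`` backslashes differs between Python versions.

```python
import re

# Same operator list as the example in the patch.
operators = ['+', '-', '*', '/', '**']

# Reverse sorting places '**' before '*' in the alternation. Without it,
# '*' would be tried first and '**' would be split into two '*' tokens.
alternation = '|'.join(map(re.escape, sorted(operators, reverse=True)))
operator_re = re.compile(alternation)

tokens = operator_re.findall('2 ** 3 * 4 - 1')

# Escaping also makes '.' match only a literal dot.
literal = re.compile(re.escape('python.exe'))
matches_literal = literal.search('run python.exe now') is not None
matches_other = literal.search('pythonXexe') is not None
```

Here ``tokens`` comes out as ``['**', '*', '-']``, ``matches_literal`` is true, and ``matches_other`` is false, regardless of how many extra characters a given Python version chooses to escape.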