[Source] Regular expression basics

Posted on 6/18/2019 9:38:16 PM
This post was last edited by Kongcicada on 2019-6-18 21:39

Preface
Regular expressions come up in projects whenever we match data, validate input against rules, or parse data collected by a crawler. The following is a summary of regular expression basics, all taken from study notes written years ago.

Main text

#Regular expression basics


.           Matches any single character other than \n
[ ]         Character class: matches any one of the characters listed inside the brackets
[^ ]        Negated character class: matches any one character not listed inside the brackets
|           Means "or"
( )         Changes the priority of the operation; also captures a group
*           Quantifier: the preceding expression occurs 0 or more times
+           Quantifier: the preceding expression occurs 1 or more times, so it must appear at least once
?           Quantifier: the preceding expression occurs 0 or 1 times
{n}         Quantifier: the preceding expression occurs exactly n times
{n,}        Quantifier: the preceding expression occurs at least n times
{n,m}       Quantifier: the preceding expression occurs at least n times and at most m times
^ $         Match the beginning and the end of the string
\d          Equivalent to [0-9]
\D          Equivalent to [^0-9]
\s          Matches any whitespace (invisible) character
\S          Matches any character except those matched by \s
\w          Equivalent to [0-9a-zA-Z_]
\W          Matches any character except those matched by \w
\b          Matches a word boundary (an assertion: it only checks the position and does not consume any characters)
=================================================


.    Represents any single character other than \n
a.b
a,b
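
A minimal sketch of the dot, assuming Python's built-in re module (the post itself does not name a language). re.fullmatch is used so that the whole string has to match the pattern; the variable name dot_results is only for illustration.

```python
import re

# "." stands for exactly one character, and that character may not be "\n".
dot_results = {
    "a,b": re.fullmatch(r"a.b", "a,b") is not None,    # any single character fits
    "ab": re.fullmatch(r"a.b", "ab") is not None,      # nothing between a and b
    "a\nb": re.fullmatch(r"a.b", "a\nb") is not None,  # "." does not match the newline
}
print(dot_results)  # {'a,b': True, 'ab': False, 'a\nb': False}
```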
=========================================
[ ] Character class: matches any one of the characters listed inside the brackets
a[0-9]b
a[a-z]b

a[0-9a-zA-Z]b
a1b
axb
aAb

a[^0-9]b means that only any single character other than 0123456789 can appear between a and b.

a[^0-9a-z]b
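
The character classes above can be tried out with a short sketch, again assuming Python's re module. The sample strings a_b and a12b are extra cases added here for contrast; they are not from the notes.

```python
import re

samples = ["a1b", "axb", "aAb", "a_b", "a12b"]

# Exactly one digit or letter between a and b.
alnum = [s for s in samples if re.fullmatch(r"a[0-9a-zA-Z]b", s)]

# Exactly one character that is NOT a digit between a and b.
not_digit = [s for s in samples if re.fullmatch(r"a[^0-9]b", s)]

print(alnum)      # ['a1b', 'axb', 'aAb']
print(not_digit)  # ['axb', 'aAb', 'a_b']
```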

=====================================================
|  means or


z|food: because | has a very low priority, this expression matches z or food; it does not match zood

(z|f)ood means zood or food
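
The difference between the two expressions can be checked with a small sketch, assuming Python's re module; the list names loose and grouped are illustrative only.

```python
import re

words = ["z", "food", "zood"]

# Without parentheses the alternatives are "z" and "food".
loose = [w for w in words if re.fullmatch(r"z|food", w)]

# With parentheses only the first letter is a choice between z and f.
grouped = [w for w in words if re.fullmatch(r"(z|f)ood", w)]

print(loose)    # ['z', 'food']
print(grouped)  # ['food', 'zood']
```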

===========================================
( ) Changes the priority of the operation.

Also used to extract (capture) groups.
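
Group extraction might look like the following sketch, assuming Python's re module, where group(0) is the whole match and group(1) is the text captured by the first pair of parentheses.

```python
import re

match = re.fullmatch(r"(z|f)ood", "food")
whole = match.group(0)  # the full match
first = match.group(1)  # what the parentheses captured
print(whole, first)  # food f
```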

=======================================
* quantifier, which indicates that the preceding expression occurs 0 or more times.

zoo* matches zo zoo zoooooooooo
(zoo)* matches the empty string, zoo, zoozoo, ...
a.*b matches ab aaddddb afjdsklf%$#@dsklfjdsklfjdsklfjb
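
A short sketch of how * binds, assuming Python's re module: in zoo* it repeats only the last o, while in (zoo)* it repeats the whole group. The helper name full is hypothetical.

```python
import re

def full(pattern: str, text: str) -> bool:
    """Return True when the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

star_results = [
    full(r"zoo*", "zo"),        # True: the last "o" occurs 0 times
    full(r"zoo*", "zooooo"),    # True: the last "o" occurs many times
    full(r"(zoo)*", "zoozoo"),  # True: the group "zoo" occurs twice
    full(r"(zoo)*", ""),        # True: the group occurs 0 times
    full(r"(zoo)*", "zooo"),    # False: a leftover "o" is not a whole group
    full(r"a.*b", "ab"),        # True: ".*" may match nothing
]
print(star_results)  # [True, True, True, True, False, True]
```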


================================================
+ quantifier, indicating that the preceding expression occurs 1 or more times; it must appear at least once.

a.+b
a9dfjsakl3824urnj324239feb
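
The only difference from * is the lower bound, which a sketch like this (assuming Python's re module) makes visible: a.+b needs at least one character between a and b, while a.*b does not.

```python
import re

long_ok = re.fullmatch(r"a.+b", "a9dfjsakl3824urnj324239feb") is not None
plus_on_ab = re.fullmatch(r"a.+b", "ab") is not None  # nothing between a and b
star_on_ab = re.fullmatch(r"a.*b", "ab") is not None  # zero characters is allowed

print(long_ok, plus_on_ab, star_on_ab)  # True False True
```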
==================================================
? quantifier, indicating that the preceding expression occurs 0 or 1 times.

a.?b
ab
axb


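Another use of ? is to turn off greedy mode: placed directly after a quantifier (for example *? or +?), it makes that quantifier match as little as possible. Quantifiers are greedy by default.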
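
Both roles of ? can be seen in a short sketch, assuming Python's re module. The sample text "a1b a2b" is made up for the greedy versus non-greedy comparison.

```python
import re

# Role 1: "0 or 1 times".
optional = [s for s in ["ab", "axb", "axxb"] if re.fullmatch(r"a.?b", s)]
print(optional)  # ['ab', 'axb']

# Role 2: after a quantifier, "?" switches it from greedy to non-greedy.
text = "a1b a2b"
greedy = re.search(r"a.*b", text).group()   # runs to the last b
lazy = re.search(r"a.*?b", text).group()    # stops at the first b
print(greedy)  # a1b a2b
print(lazy)    # a1b
```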

======================================================================
a[0-9]+b

a0b
a00b
a09b
a99999999999999999999b
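
The same pattern can also pull matches out of a longer text. This is a sketch assuming Python's re.findall; the sample text mixes matching strings with ab and axb, which are added here as non-matching cases.

```python
import re

text = "a0b ab a00b axb a09b"

# One or more digits are required between a and b.
found = re.findall(r"a[0-9]+b", text)
print(found)  # ['a0b', 'a00b', 'a09b']
```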


========================== other quantifiers =====================
{n} quantifier, which requires the preceding expression to occur exactly n times.
a[0-9]{10}b
a1234567899b
======================
{n,} quantifier, which requires the preceding expression to occur at least n times.

1[a-z]{3,}2
1axffdsafdsafdasfdsafdsafdsafdsfdsafsdfdsfdsfdsa2



========================================
{n,m} quantifier, which requires the preceding expression to occur at least n times and at most m times.

a[0-9]{3,7}b
a0000000b
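
The three counted quantifiers can be compared side by side in a sketch like this one, assuming Python's re module. The failing inputs are extra cases added for contrast, and the helper name full is hypothetical.

```python
import re

def full(pattern: str, text: str) -> bool:
    """Return True when the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

count_results = [
    full(r"a[0-9]{10}b", "a1234567899b"),       # True: exactly 10 digits
    full(r"a[0-9]{10}b", "a123b"),              # False: only 3 digits
    full(r"1[a-z]{3,}2", "1axf2"),              # True: 3 letters meets the minimum
    full(r"1[a-z]{3,}2", "1ax2"),               # False: 2 letters is too few
    full(r"a[0-9]{3,7}b", "a" + "0" * 7 + "b"), # True: 7 digits is the maximum
    full(r"a[0-9]{3,7}b", "a" + "0" * 8 + "b"), # False: 8 digits is too many
]
print(count_results)  # [True, False, True, False, True, False]
```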

===========================================

^ indicates the beginning of the string

$ indicates the end of the string.


^ and $ are anchors that describe the two ends of the string: ^ marks the beginning and $ marks the end.


^abc.*xyz$ matches abc122345xyz

^abcdefg$ matches only abcdefg; ^abcdefg matches any string that starts with abcdefg


xyz$ matches fdsfdsfxyz
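
A sketch of the anchors, assuming Python's re.search, which looks for a match anywhere in the string so that only ^ and $ pin the position. The inputs 1abc122345xyz and abcdefgh are extra cases added for contrast.

```python
import re

def found(pattern: str, text: str) -> bool:
    """Return True when the pattern matches somewhere in the text."""
    return re.search(pattern, text) is not None

anchor_results = [
    found(r"^abc.*xyz$", "abc122345xyz"),   # True: starts with abc, ends with xyz
    found(r"^abc.*xyz$", "1abc122345xyz"),  # False: does not start with abc
    found(r"xyz$", "fdsfdsfxyz"),           # True: only the ending is constrained
    found(r"^abcdefg$", "abcdefgh"),        # False: extra character before the end
    found(r"^abcdefg", "abcdefgh"),         # True: only the beginning is constrained
]
print(anchor_results)  # [True, False, True, False, True]
```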



===========================================
a[0-9]b
a\db

\d is equivalent to [0-9]
d stands for digit.

\D is equivalent to [^0-9]


\s matches any whitespace (invisible) character, such as a space, tab, or newline
a\s*b
ab
a                                            




b

\S is all characters except \s.
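
The digit and whitespace shorthands can be checked with a sketch like this, assuming Python's re module. One caveat for this assumed engine: on ordinary Python strings \d also accepts non-ASCII Unicode digits unless the re.ASCII flag is passed, so the [0-9] equivalence is exact only with that flag.

```python
import re

def full(pattern: str, text: str) -> bool:
    """Return True when the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

shorthand_results = [
    full(r"a\db", "a7b"),          # True: 7 is a digit
    full(r"a\Db", "a7b"),          # False: \D rejects digits
    full(r"a\s*b", "ab"),          # True: zero whitespace characters
    full(r"a\s*b", "a \t\n\n b"),  # True: spaces, tab, and newlines are all \s
    full(r"a\Sb", "a b"),          # False: \S rejects the space
]
print(shorthand_results)  # [True, False, True, True, False]
```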



=================================================
\w is equivalent to [0-9a-zA-Z_]
w stands for word: it matches a word character.

\W matches any character except those matched by \w.

\b matches a word boundary. (It is an assertion: it only checks the position and does not consume any characters.)
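
A sketch of word characters and word boundaries, assuming Python's re module; the sample sentences are invented for illustration. Because \b consumes nothing, the matched text is just cat, without the surrounding spaces.

```python
import re

# \w+ collects runs of word characters (letters, digits, underscore).
words = re.findall(r"\w+", "user_01, hello-world!")
print(words)  # ['user_01', 'hello', 'world']

# \W matches the characters that \w does not.
non_words = re.findall(r"\W", "a-b c")
print(non_words)  # ['-', ' ']

# \b requires a word boundary on each side, so the "cat" inside "concat" is skipped.
cats = re.findall(r"\bcat\b", "cat concat cat's")
print(cats)  # ['cat', 'cat']
```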

============================================

Unlike . (which excludes \n), the following patterns match any single character between a and b, including \n.
a[\s\S]b
a[\d\D]b
a[\w\W]b
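
A sketch of the comparison, assuming Python's re module. The last line uses re.DOTALL, a flag of this assumed engine that is not mentioned in the notes and that makes the dot match \n as well.

```python
import re

text = "a\nb"

any_char_results = [
    re.fullmatch(r"a.b", text) is not None,             # False: "." skips "\n"
    re.fullmatch(r"a[\s\S]b", text) is not None,        # True
    re.fullmatch(r"a[\d\D]b", text) is not None,        # True
    re.fullmatch(r"a[\w\W]b", text) is not None,        # True
    re.fullmatch(r"a.b", text, re.DOTALL) is not None,  # True: flag-based alternative
]
print(any_char_results)  # [False, True, True, True, True]
```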






#Hands-on practice

1: Create a new console application

2: Paste in the following code; you can test it module by module






Epilogue

Regular expression online test
















Posted on 5/5/2020 4:32:41 PM
A complete list of commonly used regular expressions
https://www.itsvse.com/thread-9181-1-1.html
(Source: Architect_Programmer)