re expression

Upload: diveshdutt

Post on 02-Jun-2018


  • 8/10/2019 Re Expression

    C# - Regular Expressions

    A regular expression is a pattern that can be matched against an input text. The .NET Framework provides a regular expression engine that performs such matching. A pattern consists of one or more character literals, operators, or constructs.

    Constructs for Defining Regular Expressions

    Various categories of characters, operators, and constructs let you define regular expressions: character escapes, character classes, anchors, grouping constructs, quantifiers, backreference constructs, alternation constructs, substitutions, and miscellaneous constructs.

    The Regex Class

    The Regex class is used for representing a regular expression.

    The Regex class has the following commonly used methods:

    S.N  Method & Description

    1    public bool IsMatch( string input )
         Indicates whether the regular expression specified in the Regex constructor finds a match in a specified input string.

    2    public bool IsMatch( string input, int startat )
         Indicates whether the regular expression specified in the Regex constructor finds a match in the specified input string, beginning at the specified starting position in the string.

    3    public static bool IsMatch( string input, string pattern )
         Indicates whether the specified regular expression finds a match in the specified input string.

    4    public MatchCollection Matches( string input )
         Searches the specified input string for all occurrences of a regular expression.

    5    public string Replace( string input, string replacement )
         In a specified input string, replaces all strings that match a regular expression pattern with a specified replacement string.

    6    public string[] Split( string input )
         Splits an input string into an array of substrings at the positions defined by a regular expression pattern specified in the Regex constructor.

    For the complete list of methods and properties, please read the Microsoft documentation on C#.
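    The methods listed above can be tried side by side. The following is a minimal sketch, not part of the original tutorial: the class name RegexMethodsDemo, its helper methods, and the sample text are illustrative assumptions, while the Regex calls are the ones from the table.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; only the Regex calls come from the table above.
public static class RegexMethodsDemo
{
    // Instance methods use the pattern passed to the constructor.
    public static readonly Regex Whitespace = new Regex(@"\s+");

    public static bool ContainsWhitespace(string input) => Whitespace.IsMatch(input);

    public static int CountRuns(string input) => Whitespace.Matches(input).Count;

    public static string Collapse(string input) => Whitespace.Replace(input, " ");

    public static string[] Words(string input) => Whitespace.Split(input);

    public static void Main()
    {
        string text = "one  two\tthree";

        Console.WriteLine(ContainsWhitespace(text));            // True
        Console.WriteLine(Whitespace.IsMatch(text, 9));         // False: "three" has no whitespace
        Console.WriteLine(Regex.IsMatch(text, @"^\w+"));        // True (static overload)
        Console.WriteLine(CountRuns(text));                     // 2
        Console.WriteLine(Collapse(text));                      // one two three
        Console.WriteLine(string.Join("|", Words(text)));       // one|two|three
    }
}
```

    The instance is created once and reused, which avoids parsing the pattern again on every call.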

    Example 1

    The following example matches words that start with 'S':

        using System;
        using System.Text.RegularExpressions;

        namespace RegExApplication
        {
            class Program
            {
                private static void showMatch(string text, string expr)
                {
                    Console.WriteLine("The Expression: " + expr);
                    MatchCollection mc = Regex.Matches(text, expr);
                    foreach (Match m in mc)
                    {
                        Console.WriteLine(m);
                    }
                }

                static void Main(string[] args)
                {
                    string str = "A Thousand Splendid Suns";

                    Console.WriteLine("Matching words that start with 'S': ");
                    showMatch(str, @"\bS\S*");
                    Console.ReadKey();
                }
            }
        }

    When the above code is compiled and executed, it produces the following result:

        Matching words that start with 'S':
        The Expression: \bS\S*
        Splendid
        Suns

    Example 2

    The following example matches words that start with 'm' and end with 'e':

        using System;
        using System.Text.RegularExpressions;

        namespace RegExApplication
        {
            class Program
            {
                private static void showMatch(string text, string expr)
                {
                    Console.WriteLine("The Expression: " + expr);
                    MatchCollection mc = Regex.Matches(text, expr);
                    foreach (Match m in mc)
                    {
                        Console.WriteLine(m);
                    }
                }

                static void Main(string[] args)
                {
                    string str = "make maze and manage to measure it";

                    Console.WriteLine("Matching words start with 'm' and ends with 'e':");
                    showMatch(str, @"\bm\S*e\b");
                    Console.ReadKey();
                }
            }
        }

    When the above code is compiled and executed, it produces the following result:

        Matching words start with 'm' and ends with 'e':
        The Expression: \bm\S*e\b
        make
        maze
        manage
        measure

    Example 3

    This example replaces extra white space:

        using System;
        using System.Text.RegularExpressions;

        namespace RegExApplication
        {
            class Program
            {
                static void Main(string[] args)
                {
                    string input = "Hello   World   ";
                    string pattern = "\\s+";
                    string replacement = " ";
                    Regex rgx = new Regex(pattern);
                    string result = rgx.Replace(input, replacement);

                    Console.WriteLine("Original String: {0}", input);
                    Console.WriteLine("Replacement String: {0}", result);
                    Console.ReadKey();
                }
            }
        }

    When the above code is compiled and executed, it produces the following result:

        Original String: Hello   World
        Replacement String: Hello World

    C# Regex: Checking for a - z and A - Z


    I want to check whether a string that was input consists of characters between a-z or A-Z. Somehow my regular expression doesn't seem to pick it up. It always returns true. I am not sure why; I gather it has to do with how I am writing my regular expression. Any help would be appreciated.

        private static bool isValid(String str)
        {
            bool valid = false;

            Regex reg = new Regex((@"a-zA-Z+"));

            if (reg.Match(str).Success)
                valid = false;
            else
                valid = true;

            return valid;
        }

    asked at 12:58 by Sophie; edited May 16 '11 at 13:04 by jlafay

    You're setting it to false after it matches. - jlafay, May 16 '11 at 13:03

    A TIP: Rather than writing a-zA-Z you can use (?i) to make your regex pattern case-insensitive and then just write a-z wherever required. - NeverHopeless, Nov 12, 13:02
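    The two comments can be checked with a short sketch. The class and method names here are hypothetical and not from the thread. It shows that the question's pattern, written without brackets, is read as literal text, and that the inline (?i) option lets [a-z] accept uppercase letters as well.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper names, used only to illustrate the comments above.
public static class LetterCheckDemo
{
    // Without brackets, "a-zA-Z+" is literal text: "a-zA-Z" with the final Z repeatable.
    public static bool MatchesQuestionPattern(string s) => Regex.IsMatch(s, @"a-zA-Z+");

    // (?i) switches on case-insensitive matching inside the pattern itself.
    public static bool IsLettersOnly(string s) => Regex.IsMatch(s, @"(?i)^[a-z]+$");

    public static void Main()
    {
        Console.WriteLine(MatchesQuestionPattern("hello"));     // False
        Console.WriteLine(MatchesQuestionPattern("a-zA-ZZZ"));  // True
        Console.WriteLine(IsLettersOnly("HelloWorld"));         // True
        Console.WriteLine(IsLettersOnly("Hello World"));        // False
    }
}
```

    Because ordinary input almost never contains the literal text a-zA-Z, the question's match fails, and the inverted if/else then reports true.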

    3 Answers

    Answer 1 (accepted, 5 votes)

    The right way would be like so:

        private static bool isValid(String str)
        {
            return Regex.IsMatch(str, @"^[a-zA-Z]+$");
        }

    This code has the following benefits:

    Using the static method instead of creating a new instance every time: the static method caches the regular expression.

    Fixed the regex. It now matches any string that consists of one or more of the characters a-z or A-Z. No other characters are allowed.

    Much shorter and more readable.

    answered at 13:03, edited May 16 '11 at 13:08, by Daniel Hilgarth

    Because of the anchors ^ and $, ^[a-zA-Z]+$ will match a string if it is entirely composed of letters (probably what the OT intends, but you should update the explanation). - Ekkehard.Horner, May 16 '11 at 13:10

    @Ekkehard: IMHO, my explanation states exactly that... - Daniel Hilgarth, May 16, 13:11

    Answer 2 (5 votes)

    Use Regex.IsMatch(str, @"^[a-zA-Z]+$");

    answered at 12:59, edited May 16 '11 at 13:45, by mathieu

    Answer 3 (4 votes)

        Regex reg = new Regex("^[a-zA-Z]+$");

    ^   start of the string
    []  character set
    +   one time or more
    $   end of the string

    ^ and $ are needed because you want to validate the whole string, not part of the string.

    answered at 13:05
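    To see why the anchors matter, the following sketch compares the same character set with and without ^ and $. The class and method names are illustrative assumptions, not code from the answers.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper names; the patterns are the ones discussed in the answers.
public static class AnchorDemo
{
    // Unanchored: true as soon as any run of letters occurs somewhere in the input.
    public static bool ContainsLetters(string s) => Regex.IsMatch(s, "[a-zA-Z]+");

    // Anchored: true only when the whole input is letters.
    public static bool IsAllLetters(string s) => Regex.IsMatch(s, "^[a-zA-Z]+$");

    public static void Main()
    {
        Console.WriteLine(ContainsLetters("abc123"));  // True
        Console.WriteLine(IsAllLetters("abc123"));     // False
        Console.WriteLine(IsAllLetters("abc"));        // True
        Console.WriteLine(IsAllLetters(""));           // False: + needs at least one letter
    }
}
```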

    Creating Regular Expressions


    Regular expressions are an efficient way to process text. The following regular expression looks complicated to a beginner:

        ^\w+$

    The Perl developer would smile. All this regular expression does is match input that consists of a single word. The symbols look difficult to understand at first. The ^ symbol refers to the start of the string. The $ refers to the end of the string. The \w refers to a single word character: A-Z, a-z, 0-9 or underscore. The + means one or more repetitions. The regular expression would match:

        test
        testtest
        test1
        1test
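    A quick way to confirm what ^\w+$ accepts is to run it against a few inputs. This is a sketch with hypothetical names (WordPatternDemo and its methods). One point beyond the article: in .NET, \w also matches Unicode letters by default, and RegexOptions.ECMAScript restricts it to A-Z, a-z, 0-9 and underscore.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class for the ^\w+$ pattern discussed above.
public static class WordPatternDemo
{
    public static bool IsSingleWord(string s) => Regex.IsMatch(s, @"^\w+$");

    // ECMAScript mode narrows \w to [a-zA-Z_0-9].
    public static bool IsSingleAsciiWord(string s) =>
        Regex.IsMatch(s, @"^\w+$", RegexOptions.ECMAScript);

    public static void Main()
    {
        Console.WriteLine(IsSingleWord("test1"));               // True
        Console.WriteLine(IsSingleWord("two words"));           // False: space is not a word character
        Console.WriteLine(IsSingleWord(""));                    // False: + requires at least one character
        Console.WriteLine(IsSingleWord("h\u00e9llo"));          // True: e-acute is a Unicode letter
        Console.WriteLine(IsSingleAsciiWord("h\u00e9llo"));     // False
    }
}
```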

    Using Regular Expressions in C# .NET

    The System.Text.RegularExpressions namespace contains the Regex class used to form and evaluate regular expressions. The Regex class contains static methods used to compare regular expressions against strings. The Regex class uses the IsMatch() static method to compare a string with a regular expression.

        bool match = Regex.IsMatch(string input, string pattern);

    If writing C# code, the example above would be:

        if (Regex.IsMatch("testtest", @"^\w+$"))
        {
            // Do something here
        }

    Another useful static method is Match(), which returns a Match object for the first match in the input string, including any captured groups. This is useful when the pattern captures more than one piece of the input text. The following code produces a match with more than one group:

        string text = "first second";
        string reg = @"^([\w]+) ([\w]+)$";

        Match m = Regex.Match(text, reg, RegexOptions.CultureInvariant);

        foreach (Group g in m.Groups)
        {
            Console.WriteLine(g.Value);
        }

    The expression groups are entered in parentheses. The example above returns three groups: the entire text as the first group, then the first word, and then the second word. Expression groups are useful when text needs to be broken down and grouped into several pieces of related text for storage or further manipulation.
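    Groups can also be given names, which often reads better than numeric indexes. The sketch below is a variation on the two-word pattern; the group names first and second and the class name are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class: same two-word pattern, with named groups.
public static class NamedGroupsDemo
{
    public static Match Parse(string text)
    {
        // (?<name>...) captures like (...) but can be looked up by name.
        return Regex.Match(text, @"^(?<first>\w+) (?<second>\w+)$",
                           RegexOptions.CultureInvariant);
    }

    public static void Main()
    {
        Match m = Parse("first second");

        Console.WriteLine(m.Groups.Count);             // 3
        Console.WriteLine(m.Groups[0].Value);          // first second
        Console.WriteLine(m.Groups["first"].Value);    // first
        Console.WriteLine(m.Groups["second"].Value);   // second
    }
}
```

    Group 0 is always the whole match, so a pattern with two capturing groups reports three groups in total.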

    A Quick Example

    In this example, we validate an email address using regular expressions. My regular expression works:

        ^((([\w]+\.[\w]+)+)|([\w]+))@(([\w]+\.)+)([A-Za-z]{1,3})$

    However, this isn't the only expression used to validate email addresses. There are at least two other ways that I have come across. There are many more.

    We write a small C# console application that takes some text as an input, and determines if the text is an email address.

        using System;
        using System.Text.RegularExpressions;

        string text = Console.ReadLine();
        string reg = @"^((([\w]+\.[\w]+)+)|([\w]+))@(([\w]+\.)+)([A-Za-z]{1,3})$";

        if (Regex.IsMatch(text, reg))
        {
            Console.WriteLine("Email.");
        }
        else
        {
            Console.WriteLine("Not email.");
        }

    Try this with a few real and fake email addresses and see if it works. Let me know if you find an error.
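    One way to try the expression without typing addresses by hand is a small test helper. This is a sketch: the class name, method name, and sample addresses are made up for illustration. It also shows one limitation that follows from the {1,3} quantifier: a final part longer than three letters is rejected.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical test helper around the article's email expression.
public static class EmailPatternDemo
{
    private const string Pattern =
        @"^((([\w]+\.[\w]+)+)|([\w]+))@(([\w]+\.)+)([A-Za-z]{1,3})$";

    public static bool LooksLikeEmail(string text) => Regex.IsMatch(text, Pattern);

    public static void Main()
    {
        Console.WriteLine(LooksLikeEmail("user@example.com"));        // True
        Console.WriteLine(LooksLikeEmail("first.last@example.com"));  // True
        Console.WriteLine(LooksLikeEmail("not an email"));            // False
        Console.WriteLine(LooksLikeEmail("user@example.info"));       // False: last part has 4 letters
    }
}
```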

    Documentation

    Regular expressions are developed differently. The same task can be accomplished using many different expressions. Expressions created by one developer may be undecipherable to another.

    This is why documenting regular expressions is a very important part of the development process. The expression code comments often span several lines, and are worth the effort in case your expression has unintended effects, or if another developer takes over your code. Enforcing good documentation standards for regular expressions will ensure that maintenance issues are minimal.

    For example, if we document the regular expression for validating email addresses above, we would write comments like these:

        // Validating email addresses
        //
        // @"^((([\w]+\.[\w]+)+)|([\w]+))@(([\w]+\.)+)([A-Za-z]{1,3})$"
        //
        // The expression has three expression
        // groups.
        //
        // 1. ((([\w]+\.[\w]+)+)|([\w]+))
        //
        //    The LHS of the or clause states
        //    that there may be more than one
        //    sequence of two words with a .
        //    between them.
        //
        //    The RHS of the or clause states
        //    that there may be a single word.
        //
        // 2. (([\w]+\.)+)
        //
        //    This expression states that there
        //    may be as many
        //    words separated by a . between them
        //    as necessary.
        //
        // 3. ([A-Za-z]{1,3})
        //
        //    This expression states that the
        //    last set of characters may be upper
        //    or lowercase letters. There must be
        //    a minimum of 1 and a maximum of 3.

    This may be considered a long set of comments for a lot of development standards, but the expression has been broken down into expression groups. A new developer has very little difficulty in understanding the function and motivation behind writing the expression. This practice should be consistently enforced to avoid headaches when upgrading or debugging software.
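    As an alternative to separate comment lines, .NET can keep the documentation inside the pattern itself. This is not the article's approach; it is a sketch using RegexOptions.IgnorePatternWhitespace, which ignores unescaped whitespace in the pattern and treats # as the start of a comment. The class and method names are assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo: the same email expression, documented inline.
public static class DocumentedRegexDemo
{
    private static readonly Regex Email = new Regex(@"
        ^
        ( (([\w]+\.[\w]+)+)   # one or more word.word sequences
        | ([\w]+) )           # or a single word
        @
        (([\w]+\.)+)          # one or more words, each followed by a dot
        ([A-Za-z]{1,3})       # final part: 1 to 3 letters
        $",
        RegexOptions.IgnorePatternWhitespace);

    public static bool IsEmail(string text) => Email.IsMatch(text);

    public static void Main()
    {
        Console.WriteLine(IsEmail("user@example.com"));   // True
        Console.WriteLine(IsEmail("user@example.info"));  // False
    }
}
```

    The comments stay next to the parts they describe, so they are less likely to drift when the expression is edited.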

    Useful Regex Software

    If you've used a shell script in *NIX, then you have probably used grep. Windows has the PowerGREP tool, which is similar to grep. PowerShell is another tool which is built on the .NET regular expression engine, and has command-line scripting utilities. Expresso by Ultrapico (www.ultrapico.com) is a free regular expression editor which you can use to build and test your regular expressions.

    Conclusion

    Regular expressions are an efficient way to search, identify and validate large quantities of text without having to write any comparisons. Although they may be complicated, writing and documenting regular expressions allows the developer to concentrate on more important parts of the implementation process. The use of several free and open source regular expression tools makes understanding and building regular expressions a worthwhile task.

    To download this technical article in PDF format, go to the Coactum Solutions website at http://www.coactumsolutions.com/Articles.aspx.

    C# Regex.Match

    Regex.Match searches strings based on a pattern. It isolates part of a string based on the pattern specified. It requires that you use the text-processing language for the pattern. It proves to be useful and effective in many C# programs.

    Input and output required for examples

        Input string                 Required match
        /content/some-page.aspx      some-page
        /content/alternate-1.aspx    alternate-1
        /images/something.png        -

    Example

    We first see how you can match the filename in a directory path with Regex. This has more constraints regarding the acceptable characters than many methods have. You can see the char range in the second parameter to Regex.Match.

    Program that uses Regex.Match: C#

        using System;
        using System.Text.RegularExpressions;

        class Program
        {
            static void Main()
            {
                // First we see the input string.
                string input = "/content/alternate-1.aspx";

                // Here we call Regex.Match.
                Match match = Regex.Match(input, @"content/([A-Za-z0-9\-]+)\.aspx$",
                    RegexOptions.IgnoreCase);

                // Here we check the Match instance.
                if (match.Success)
                {
                    // Finally, we get the Group value and display it.
                    string key = match.Groups[1].Value;
                    Console.WriteLine(key);
                }
            }
        }

    Output

    alternate-1
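    The same pattern can be checked against all three inputs from the requirements table. The sketch below wraps the call in a hypothetical ExtractKey helper (not part of the original program) that returns null where the table shows "-".

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the Regex.Match call from the program above.
public static class PathKeyDemo
{
    public static string ExtractKey(string input)
    {
        Match match = Regex.Match(input, @"content/([A-Za-z0-9\-]+)\.aspx$",
                                  RegexOptions.IgnoreCase);
        // Group 1 holds the text captured by the parentheses.
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(ExtractKey("/content/some-page.aspx"));        // some-page
        Console.WriteLine(ExtractKey("/content/alternate-1.aspx"));      // alternate-1
        Console.WriteLine(ExtractKey("/images/something.png") == null);  // True
    }
}
```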

    here, and we must remember this.

    ToLower

    Using ToLower instead of RegexOptions.IgnoreCase on the Regex yielded a 10% or higher improvement. Since I needed a lowercase result, calling the C# string ToLower method first was simpler.

    Program that also uses Regex.Match: C#

        using System;
        using System.Text.RegularExpressions;

        class Program
        {
            static void Main()
            {
                // This is the input string.
                string input = "/content/alternate-1.aspx";

                // Here we lowercase our input first.
                input = input.ToLower();

                Match match = Regex.Match(input, @"content/([A-Za-z0-9\-]+)\.aspx$");

    }

        static class RegexUtil
        {
            static Regex _regex = new Regex(@"/content/([a-z0-9\-]+)\.aspx$");

            /// <summary>
            /// This returns the key that is matched within the input.
            /// </summary>
            static public string MatchKey(string input)
            {
                Match match = _regex.Match(input.ToLower());
                if (match.Success)
                {
                    return match.Groups[1].Value;
                }
                else
                {
                    return null;
                }
            }
        }

    Output

    alternate-1

    This static class stores a Regex instance that can be used project-wide. We initialize it inline. The class exposes a MatchKey method. This is a useful method I developed to return the string that we want from the input value.

    Pattern description. It uses a letter range. In this code I show the Regex with the "A-Z" range removed, because the string is already lowercased. I found that removing as many options from the Regex as possible boosted performance.

    Tip: With this code, I found that using RegexOptions.RightToLeft made the pattern slightly faster as well.

    Note: The expression engine has to evaluate fewer characters in this case. But this option could slow down or speed up your Regex.

    Numbers

    One common requirement is extracting a number from a string. We can do this with Regex.Match. Match handles only one number. If a string has more than one, use Regex.Matches instead.

    Next: We extract a group of digit characters and access the Value string representation of that number.

    Also: To parse the number, use int.Parse or int.TryParse on the Value here. This will convert it to an int.

    Program that uses Match on numbers: C#

        using System;
        using System.Text.RegularExpressions;

        class Program
        {
            static void Main()
            {
                // ... Input string.
                string input = "Dot Net 100 Perls";

                // ... One or more digits.
                Match m = Regex.Match(input, @"\d+");

                // ... Write value.
                Console.WriteLine(m.Value);
            }
        }

    Output

    100
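    For input with several numbers, the text recommends Regex.Matches together with int.TryParse. A minimal sketch of that combination follows. The class name, method name, and sample input are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper combining Regex.Matches with int.TryParse.
public static class NumberExtractDemo
{
    public static List<int> ExtractNumbers(string input)
    {
        var numbers = new List<int>();
        foreach (Match m in Regex.Matches(input, @"\d+"))
        {
            // TryParse skips digit runs that do not fit in an int instead of throwing.
            if (int.TryParse(m.Value, out int value))
            {
                numbers.Add(value);
            }
        }
        return numbers;
    }

    public static void Main()
    {
        List<int> found = ExtractNumbers("Dot 12 Net 100 Perls 7");
        Console.WriteLine(string.Join(", ", found));  // 12, 100, 7
    }
}
```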

    Performance

    You can add the RegexOptions.Compiled flag for a substantial performance gain at runtime. This will however make your program start up slower. With RegexOptions.Compiled we often see 30% better performance.
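    The flag is passed when the Regex is constructed. The following sketch, with hypothetical class and method names, builds the same pattern with and without RegexOptions.Compiled and times both with a Stopwatch. No timings are shown because they vary by machine and runtime, so treat the measurement as a rough check rather than a benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

// Hypothetical demo: same pattern, interpreted versus compiled.
public static class CompiledRegexDemo
{
    private const string Pattern = @"content/([a-z0-9\-]+)\.aspx$";

    // Both instances are built once and reused; only the options differ.
    public static readonly Regex Interpreted = new Regex(Pattern);
    public static readonly Regex Compiled = new Regex(Pattern, RegexOptions.Compiled);

    public static string MatchKey(Regex regex, string input)
    {
        Match match = regex.Match(input);
        return match.Success ? match.Groups[1].Value : null;
    }

    public static long TimeMs(Regex regex, string input, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            regex.Match(input);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        string input = "/content/alternate-1.aspx";
        Console.WriteLine(MatchKey(Compiled, input));  // alternate-1
        Console.WriteLine("Interpreted: " + TimeMs(Interpreted, input, 100000) + " ms");
        Console.WriteLine("Compiled:    " + TimeMs(Compiled, input, 100000) + " ms");
    }
}
```

    Storing the compiled instance in a static field means the extra construction cost is paid once per process rather than on every call.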


    Summary

    We used Regex.Match. This method extracts a single match from the input string. We can access the matched data with the Value property. And similar methods, such as IsMatch and Matches, are often helpful.

    How to: Search Strings Using Regular Expressions (C# Programming Guide)

    Visual Studio 2008

    The System.Text.RegularExpressions.Regex class can be used to search strings. These searches can range in complexity from very simple to making full use of regular expressions. The following are two examples of string searching by using the Regex class. For more information, see .NET Framework Regular Expressions.

    Example

    The following code is a console application that performs a simple case-insensitive search of the strings in an array. The static method Regex.IsMatch performs the search given the string to search and a string that contains the search pattern. In this case, a third argument is used to indicate that case should be ignored. For more information, see System.Text.RegularExpressions.RegexOptions.

        class TestRegularExpressions
        {
            static void Main()
            {
                string[] sentences =
                {
                    "C# code",
                    "Chapter 2: Writing Code",
                    "Unicode",
                    "no match here"
                };

                string sPattern = "code";

                foreach (string s in sentences)
                {
                    System.Console.Write("{0,24}", s);

                    if (System.Text.RegularExpressions.Regex.IsMatch(s, sPattern, System.Text.RegularExpressions.RegexOptions.IgnoreCase))
                    {
                        System.Console.WriteLine("  (match for '{0}' found)", sPattern);
                    }
                    else
                    {
                        System.Console.WriteLine();
                    }
                }

                // Keep the console window open in debug mode.
                System.Console.WriteLine("Press any key to exit.");
                System.Console.ReadKey();


        }
    }
    /* Output:
                     C# code (match for 'code' found)
     Chapter 2: Writing Code (match for 'code' found)
                     Unicode (match for 'code' found)
               no match here
    */

    The following code is a console application that uses regular expressions to validate the format of each string in an array. The validation requires that each string take the form of a telephone number in which three groups of digits are separated by dashes, the first two groups contain three digits, and the third group contains four digits. This is done by using the regular expression ^\d{3}-\d{3}-\d{4}$, which is written as "^\\d{3}-\\d{3}-\\d{4}$" in a regular C# string literal because each backslash must be escaped. For more information, see Regular Expression Language - Quick Reference.

    C#

    class TestRegularExpressionValidation
    {
        static void Main()
        {
            string[] numbers =
            {
                "123-555-0190",
                "444-234-22450",
                "690-555-0178",
                "146-893-232",
                "146-555-0122",
                "4007-555-0111",
                "407-555-0111",
                "407-2-5555",
            };

            // ^ and $ anchor the pattern so the whole string must match.
            string sPattern = "^\\d{3}-\\d{3}-\\d{4}$";

            foreach (string s in numbers)
            {
                // Right-align each number in a 14-character column.
                System.Console.Write("{0,14}", s);

                if (System.Text.RegularExpressions.Regex.IsMatch(s, sPattern))
                {
                    System.Console.WriteLine(" - valid");
                }
                else
                {
                    System.Console.WriteLine(" - invalid");
                }
            }

            // Keep the console window open in debug mode.
            System.Console.WriteLine("Press any key to exit.");
            System.Console.ReadKey();


        }
    }
    /* Output:
      123-555-0190 - valid
     444-234-22450 - invalid
      690-555-0178 - valid
       146-893-232 - invalid
      146-555-0122 - valid
     4007-555-0111 - invalid
      407-555-0111 - valid
        407-2-5555 - invalid
    */
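    The doubled backslashes in the telephone pattern are a C# string-escaping detail, not part of the regular expression. As a hedged sketch, the same pattern can be written as a verbatim string literal (prefixed with @), where each backslash appears only once. The class and method names below are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; it reuses the pattern from the example above.
static class PhonePatternDemo
{
    // Verbatim literal: backslashes are not escape characters here.
    const string Pattern = @"^\d{3}-\d{3}-\d{4}$";

    // True when the string has the form ddd-ddd-dddd.
    public static bool IsValid(string s)
    {
        return Regex.IsMatch(s, Pattern);
    }

    static void Main()
    {
        // The verbatim and the escaped literal produce the same string.
        Console.WriteLine(Pattern == "^\\d{3}-\\d{3}-\\d{4}$"); // True

        Console.WriteLine(IsValid("123-555-0190")); // True
        Console.WriteLine(IsValid("407-2-5555"));   // False
    }
}
```

    Verbatim literals tend to be easier to read for longer patterns, since the text in the source file is exactly what the regular expression engine receives.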