Jakarta-ORO 2.0.6 API: Class Perl5Util
org.apache.oro.text.perl
Field Summary
static int | SPLIT_ALL | A constant passed to the split() methods indicating that all occurrences of a pattern should be used to split a string.
Constructor Summary
Perl5Util() | Default constructor for Perl5Util.
Perl5Util(PatternCache cache) | A secondary constructor for Perl5Util.
Method Summary
int | begin(int group) | Returns the begin offset of the subgroup of the last match found relative to the beginning of the match.
int | beginOffset(int group) | Returns an offset marking the beginning of the last pattern match found relative to the beginning of the input from which the match was extracted.
int | end(int group) | Returns the end offset of the subgroup of the last match found relative to the beginning of the match.
int | endOffset(int group) | Returns an offset marking the end of the last pattern match found relative to the beginning of the input from which the match was extracted.
MatchResult | getMatch() | Returns the last match found by a call to a match(), substitute(), or split() method.
java.lang.String | group(int group) | Returns the contents of the parenthesized subgroups of the last match found according to the behavior dictated by the MatchResult interface.
int | groups() |
int | length() | Returns the length of the last match found.
boolean | match(java.lang.String pattern, char[] input) | Searches for the first pattern match somewhere in a character array, taking a pattern specified in Perl5 native format: [m]/pattern/[i][m][s][x]
boolean | match(java.lang.String pattern, PatternMatcherInput input) | Searches for the next pattern match somewhere in an org.apache.oro.text.regex.PatternMatcherInput instance, taking a pattern specified in Perl5 native format: [m]/pattern/[i][m][s][x]
boolean | match(java.lang.String pattern, java.lang.String input) | Searches for the first pattern match in a String, taking a pattern specified in Perl5 native format: [m]/pattern/[i][m][s][x]
java.lang.String | postMatch() | Returns the part of the input following the last match found.
char[] | postMatchCharArray() | Returns the part of the input following the last match found as a char array.
java.lang.String | preMatch() | Returns the part of the input preceding the last match found.
char[] | preMatchCharArray() | Returns the part of the input preceding the last match found as a char array.
void | split(java.util.Collection results, java.lang.String input) | Splits input in the default Perl manner, splitting on all whitespace.
void | split(java.util.Collection results, java.lang.String pattern, java.lang.String input) | This method is identical to calling: split(results, pattern, input, SPLIT_ALL);
void | split(java.util.Collection results, java.lang.String pattern, java.lang.String input, int limit) | Splits a String into strings that are appended to a Collection, but no more than a specified limit.
java.util.Vector | split(java.lang.String input) | Deprecated. Use split(Collection results, String input) instead.
java.util.Vector | split(java.lang.String pattern, java.lang.String input) | Deprecated. Use split(Collection results, String pattern, String input) instead.
java.util.Vector | split(java.lang.String pattern, java.lang.String input, int limit) | Deprecated. Use split(Collection results, String pattern, String input, int limit) instead.
int | substitute(java.lang.StringBuffer result, java.lang.String expression, java.lang.String input) | Substitutes a pattern in a given input with a replacement string.
java.lang.String | substitute(java.lang.String expression, java.lang.String input) | Substitutes a pattern in a given input with a replacement string.
java.lang.String | toString() | Returns the same as group(0).
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Field Detail
public static final int SPLIT_ALL
A constant passed to the split() methods indicating that all occurrences of a pattern should be used to split a string.
Constructor Detail
public Perl5Util(PatternCache cache)
A secondary constructor for Perl5Util that takes the PatternCache to use. For example:
    // We know we're going to use close to 50 expressions a whole lot, so
    // we create a cache of the proper size.
    util = new Perl5Util(new PatternCacheLRU(50));
or
    // We're only going to use a few expressions and know that second-chance
    // FIFO is best suited to the order in which we are using the patterns.
    util = new Perl5Util(new PatternCacheFIFO2(10));
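To make the role of a cache's capacity and replacement policy concrete, here is a minimal sketch of a least-recently-used cache of compiled patterns. It is not the PatternCacheLRU implementation: the class name LruPatternCacheSketch and its methods are hypothetical, and it uses java.util.regex.Pattern and java.util.LinkedHashMap from the standard library rather than any ORO class.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical stand-in for an LRU pattern cache; not part of Jakarta-ORO. */
final class LruPatternCacheSketch {
    private final Map<String, Pattern> cache;

    LruPatternCacheSketch(final int capacity) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.cache = new LinkedHashMap<String, Pattern>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Pattern> eldest) {
                // Evict the least recently used pattern once capacity is exceeded.
                return size() > capacity;
            }
        };
    }

    /** Returns the cached compiled pattern, compiling and storing it on a miss. */
    Pattern get(String regex) {
        Pattern compiled = cache.get(regex);
        if (compiled == null) {
            compiled = Pattern.compile(regex);
            cache.put(regex, compiled);
        }
        return compiled;
    }

    int size() {
        return cache.size();
    }

    /** Membership check that does not count as a use of the entry. */
    boolean contains(String regex) {
        return cache.containsKey(regex);
    }
}
```

With a capacity of 2, requesting a third distinct expression evicts whichever of the first two was used least recently, which is the trade-off the capacity argument in the examples above controls.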
public Perl5Util()
Default constructor for Perl5Util.
Method Detail
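public boolean match(java.lang.String pattern, char[] input) throws MalformedPerl5PatternException
Searches for the first pattern match somewhere in a character array, taking a pattern specified in Perl5 native format:
    [m]/pattern/[i][m][s][x]
The m prefix is optional and the meanings of the optional trailing options are:
    i - case-insensitive match
    m - treat the input as consisting of multiple lines
    s - treat the input as consisting of a single line
    x - enable extended expression syntax incorporating whitespace and comments
If the input contains the pattern, the org.apache.oro.text.regex.MatchResult can be obtained by calling getMatch(). However, Perl5Util implements the MatchResult interface as a wrapper around the last MatchResult found, so you can call its methods to access match information.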
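The shape of the [m]/pattern/[i][m][s][x] format can be illustrated by taking such an expression apart. The sketch below is not how Perl5Util parses expressions: Perl5ExprSketch is a hypothetical helper, it targets java.util.regex rather than ORO, and the flag mapping (i, m, s, x onto CASE_INSENSITIVE, MULTILINE, DOTALL, COMMENTS) is an assumed approximation of the Perl5 options.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Jakarta-ORO. */
final class Perl5ExprSketch {
    /** Compiles an expression of the form [m]/pattern/[i][m][s][x]. */
    static Pattern compile(String expr) {
        // Skip the optional m prefix; the character after it is the delimiter.
        int open = expr.startsWith("m") ? 1 : 0;
        if (expr.length() < open + 2) {
            throw new IllegalArgumentException("Expression too short: " + expr);
        }
        char delim = expr.charAt(open);
        // The last delimiter separates the pattern from the trailing options.
        int close = expr.lastIndexOf(delim);
        if (close == open) {
            throw new IllegalArgumentException("Missing closing delimiter: " + expr);
        }
        String regex = expr.substring(open + 1, close);
        int flags = 0;
        for (char option : expr.substring(close + 1).toCharArray()) {
            switch (option) {
                case 'i': flags |= Pattern.CASE_INSENSITIVE; break;
                case 'm': flags |= Pattern.MULTILINE; break;
                case 's': flags |= Pattern.DOTALL; break;
                case 'x': flags |= Pattern.COMMENTS; break;
                default:
                    throw new IllegalArgumentException("Unknown option: " + option);
            }
        }
        return Pattern.compile(regex, flags);
    }
}
```

For instance, Perl5ExprSketch.compile("m/fo+/i") yields a case-insensitive pattern for fo+, and compile("/a.b/s") lets the dot match a line terminator.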
Parameters:
pattern - The pattern to search for.
input - The char[] input to search.
Throws:
MalformedPerl5PatternException - If there is an error in the pattern. You are not forced to catch this exception because it is derived from RuntimeException.
public boolean match(java.lang.String pattern, java.lang.String input) throws MalformedPerl5PatternException
Searches for the first pattern match in a String, taking a pattern specified in Perl5 native format:
    [m]/pattern/[i][m][s][x]
The m prefix is optional and the meanings of the optional trailing options are:
    i - case-insensitive match
    m - treat the input as consisting of multiple lines
    s - treat the input as consisting of a single line
    x - enable extended expression syntax incorporating whitespace and comments
If the input contains the pattern, the MatchResult can be obtained by calling getMatch(). However, Perl5Util implements the MatchResult interface as a wrapper around the last MatchResult found, so you can call its methods to access match information.
Parameters:
pattern - The pattern to search for.
input - The String input to search.
Throws:
MalformedPerl5PatternException - If there is an error in the pattern. You are not forced to catch this exception because it is derived from RuntimeException.
public boolean match(java.lang.String pattern, PatternMatcherInput input) throws MalformedPerl5PatternException
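Searches for the next pattern match somewhere in an org.apache.oro.text.regex.PatternMatcherInput instance, taking a pattern specified in Perl5 native format:
    [m]/pattern/[i][m][s][x]
The m prefix is optional and the meanings of the optional trailing options are:
    i - case-insensitive match
    m - treat the input as consisting of multiple lines
    s - treat the input as consisting of a single line
    x - enable extended expression syntax incorporating whitespace and comments
If the input contains the pattern, the MatchResult can be obtained by calling getMatch(). However, Perl5Util implements the MatchResult interface as a wrapper around the last MatchResult found, so you can call its methods to access match information.
After the call to this method, the PatternMatcherInput current offset is advanced to the end of the match, so you can use it to repeatedly search for expressions in the entire input using a while loop as explained in the PatternMatcherInput documentation.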
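The repeated-search idiom is a loop that keeps asking for the next match until none is left. The sketch below shows the same loop shape with java.util.regex.Matcher, whose find() also resumes from the end of the previous match. It is a stand-in, not ORO code, and RepeatedMatchSketch and findAll are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration using java.util.regex, not Jakarta-ORO. */
final class RepeatedMatchSketch {
    /** Collects every match of regex in input, resuming after each match. */
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        // Each find() call starts searching where the previous match ended.
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }
}
```

Calling RepeatedMatchSketch.findAll("\\d+", "8-12,15,18") visits each run of digits in turn.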
Parameters:
pattern - The pattern to search for.
input - The PatternMatcherInput to search.
Throws:
MalformedPerl5PatternException - If there is an error in the pattern. You are not forced to catch this exception because it is derived from RuntimeException.
public MatchResult getMatch()
Returns the last match found by a call to a match(), substitute(), or split() method.
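public int substitute(java.lang.StringBuffer result, java.lang.String expression, java.lang.String input) throws MalformedPerl5PatternException
Substitutes a pattern in a given input with a replacement string. The substitution expression is specified in Perl5 native format:
    s/pattern/replacement/[g][i][m][o][s][x]
The s prefix is mandatory and the meanings of the optional trailing options are:
    g - substitute all occurrences of pattern with replacement, rather than only the first
    i - case-insensitive match
    m - treat the input as consisting of multiple lines
    o - if variable interpolation is used, evaluate the interpolation only once; see Util.substitute(). The default is to compute each interpolation independently. See Util.substitute() and Perl5Substitution for more details on variable interpolation in substitutions.
    s - treat the input as consisting of a single line
    x - enable extended expression syntax incorporating whitespace and comments
A delimiter other than the slash can be used to avoid backslashing. For example, instead of writing:
    numSubs = util.substitute(result, "s/foo\\/bar/goo\\/\\/baz/", input);
you could more easily write:
    numSubs = util.substitute(result, "s#foo/bar#goo//baz#", input);
where the hash marks are used instead of slashes.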
There is a special case of backslashing that you need to pay attention to. As demonstrated above, to denote a delimiter in the substituted string it must be backslashed. However, this can be a problem when you want to denote a backslash at the end of the substituted string. As of PerlTools 1.3, a new means of handling this situation has been implemented. In previous versions, the behavior was that
"... a double backslash (quadrupled in the Java String) always represents two backslashes unless the second backslash is followed by the delimiter, in which case it represents a single backslash."
The new behavior is that a backslash is always a backslash in the substitution portion of the expression unless it is used to escape a delimiter. A backslash is considered to escape a delimiter if an even number of contiguous backslashes precede the backslash and the delimiter following the backslash is not the FINAL delimiter in the expression. Therefore, backslashes preceding final delimiters are never considered to escape the delimiter. The following, which used to be an invalid expression and required a special-case extra backslash, will now replace all instances of / with \:
numSubs = util.substitute(result, "s#/#\\#g", input);
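The escaping rule can be written down as a small predicate. The sketch below is a direct transcription of the rule as stated here, not the parser Perl5Util uses. DelimiterEscapeSketch and escapesDelimiter are hypothetical names, and the code assumes the expression starts with s followed by the delimiter and that the index passed in lies in the replacement portion.

```java
/** Hypothetical helper, not part of Jakarta-ORO. */
final class DelimiterEscapeSketch {
    /**
     * Reports whether the backslash at index i of a substitution expression
     * such as s#pattern#replacement#g escapes the delimiter that follows it.
     */
    static boolean escapesDelimiter(String expr, int i) {
        char delim = expr.charAt(1); // assumes the form s<delim>...
        boolean backslashBeforeDelimiter = expr.charAt(i) == '\\'
                && i + 1 < expr.length()
                && expr.charAt(i + 1) == delim;
        if (!backslashBeforeDelimiter) {
            return false;
        }
        // Count the contiguous backslashes immediately before index i.
        int preceding = 0;
        for (int j = i - 1; j >= 0 && expr.charAt(j) == '\\'; j--) {
            preceding++;
        }
        // A backslash in front of the final delimiter never escapes it.
        boolean beforeFinalDelimiter = (i + 1 == expr.lastIndexOf(delim));
        return preceding % 2 == 0 && !beforeFinalDelimiter;
    }
}
```

Applied to the Java literal "s#/#\\#g", the single backslash sits in front of the final delimiter, so the predicate returns false and the backslash is kept as the replacement text. In "s/foo\\/bar/goo\\/\\/baz/", both backslashes in the replacement escape the slash that follows them.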
Parameters:
result - The StringBuffer in which to store the result of the substitutions. The buffer is only appended to.
expression - The Perl5 substitution regular expression.
input - The input on which to perform substitutions.
Throws:
MalformedPerl5PatternException - If there is an error in the expression. You are not forced to catch this exception because it is derived from RuntimeException.
public java.lang.String substitute(java.lang.String expression, java.lang.String input) throws MalformedPerl5PatternException
Substitutes a pattern in a given input with a replacement string. This method is a convenience equivalent to:
    String result;
    StringBuffer buffer = new StringBuffer();
    perl.substitute(buffer, expression, input);
    result = buffer.toString();
Parameters:
expression - The Perl5 substitution regular expression.
input - The input on which to perform substitutions.
Throws:
MalformedPerl5PatternException - If there is an error in the expression. You are not forced to catch this exception because it is derived from RuntimeException.
See Also:
substitute(java.lang.StringBuffer, java.lang.String, java.lang.String)
public void split(java.util.Collection results, java.lang.String pattern, java.lang.String input, int limit) throws MalformedPerl5PatternException
Splits a String into strings that are appended to a Collection, but no more than a specified limit. The delimiter is a pattern specified in Perl5 native format:
    [m]/pattern/[i][m][s][x]
The m prefix is optional and the meanings of the optional trailing options are:
    i - case-insensitive match
    m - treat the input as consisting of multiple lines
    s - treat the input as consisting of a single line
    x - enable extended expression syntax incorporating whitespace and comments
The limit parameter causes the string to be split on at most the first limit - 1 pattern occurrences.
Of special note is that this split method performs EXACTLY the same as the Perl split() function. In other words, if the split pattern contains parentheses, additional elements are created from each of the matching subgroups in the pattern. Using an example similar to the one from the Camel book:
    split(list, "/([,-])/", "8-12,15,18")
produces a list containing:
    { "8", "-", "12", ",", "15", ",", "18" }
The Util.split() method does NOT implement this particular behavior because it is intended to be usable with Pattern instances other than Perl5Pattern.
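The interaction between capturing parentheses and the limit can be sketched with the standard library. The code below is not the Perl5Util implementation: PerlSplitSketch is a hypothetical helper built on java.util.regex, it takes a bare regular expression rather than a /pattern/ expression, and it deliberately ignores Perl details such as zero-length matches and trailing empty fields.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Jakarta-ORO. */
final class PerlSplitSketch {
    /**
     * Splits input on regex, also emitting the captured subgroups of each
     * delimiter match. At most limit - 1 delimiter matches are used;
     * a limit <= 0 uses every match.
     */
    static List<String> split(String regex, String input, int limit) {
        List<String> results = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int fieldStart = 0;
        int splits = 0;
        while ((limit <= 0 || splits < limit - 1) && matcher.find()) {
            // The text between the previous delimiter and this one is a field.
            results.add(input.substring(fieldStart, matcher.start()));
            // Each parenthesized subgroup contributes an extra element.
            for (int g = 1; g <= matcher.groupCount(); g++) {
                results.add(matcher.group(g)); // null if the group did not participate
            }
            fieldStart = matcher.end();
            splits++;
        }
        // Whatever remains after the last used delimiter is the final field.
        results.add(input.substring(fieldStart));
        return results;
    }
}
```

With the regular expression ([,-]) and the input 8-12,15,18, a limit of 0 reproduces the seven-element result shown above, while a limit of 2 uses only the first delimiter and leaves 12,15,18 as the last element.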
Parameters:
results - A Collection to which the substrings of the input that occur between the regular expression delimiter occurrences are appended. The input will not be split into any more substrings than the specified limit. A way of thinking of this is that only the first limit - 1 matches of the delimiting regular expression will be used to split the input.
pattern - The regular expression to use as a split delimiter.
input - The String to split.
limit - The limit on the number of substrings appended to results. Values <= 0 produce the same behavior as the SPLIT_ALL constant, which causes the limit to be ignored and splits to be performed on all occurrences of the pattern. You should use the SPLIT_ALL constant to achieve this behavior instead of relying on the default behavior associated with non-positive limit values.
Throws:
MalformedPerl5PatternException - If there is an error in the expression. You are not forced to catch this exception because it is derived from RuntimeException.
public void split(java.util.Collection results, java.lang.String pattern, java.lang.String input) throws MalformedPerl5PatternException
This method is identical to calling:
    split(results, pattern, input, SPLIT_ALL);
public void split(java.util.Collection results, java.lang.String input) throws MalformedPerl5PatternException
Splits input in the default Perl manner, splitting on all whitespace. This method is identical to calling:
    split(results, "/\\s+/", input);
public java.util.Vector split(java.lang.String pattern, java.lang.String input, int limit) throws MalformedPerl5PatternException
Deprecated. Use split(Collection results, String pattern, String input, int limit) instead.
The delimiter is a pattern specified in Perl5 native format:
    [m]/pattern/[i][m][s][x]
The m prefix is optional and the meanings of the optional trailing options are:
    i - case-insensitive match
    m - treat the input as consisting of multiple lines
    s - treat the input as consisting of a single line
    x - enable extended expression syntax incorporating whitespace and comments
The limit parameter causes the string to be split on at most the first limit - 1 pattern occurrences.
Of special note is that this split method performs EXACTLY the same as the Perl split() function. In other words, if the split pattern contains parentheses, additional Vector elements are created from each of the matching subgroups in the pattern. Using an example similar to the one from the Camel book:
    split("/([,-])/", "8-12,15,18")
produces the Vector containing:
    { "8", "-", "12", ",", "15", ",", "18" }
The Util.split() method does NOT implement this particular behavior because it is intended to be usable with Pattern instances other than Perl5Pattern.
Parameters:
pattern - The regular expression to use as a split delimiter.
input - The String to split.
limit - The limit on the size of the returned Vector. Values <= 0 produce the same behavior as the SPLIT_ALL constant, which causes the limit to be ignored and splits to be performed on all occurrences of the pattern. You should use the SPLIT_ALL constant to achieve this behavior instead of relying on the default behavior associated with non-positive limit values.
Returns:
A Vector containing the substrings of the input that occur between the regular expression delimiter occurrences. The input will not be split into any more substrings than the specified limit. A way of thinking of this is that only the first limit - 1 matches of the delimiting regular expression will be used to split the input.
Throws:
MalformedPerl5PatternException - If there is an error in the expression. You are not forced to catch this exception because it is derived from RuntimeException.
public java.util.Vector split(java.lang.String pattern, java.lang.String input) throws MalformedPerl5PatternException
Deprecated. Use split(Collection results, String pattern, String input) instead.
This method is identical to calling:
    split(pattern, input, SPLIT_ALL);
public java.util.Vector split(java.lang.String input) throws MalformedPerl5PatternException
Deprecated. Use split(Collection results, String input) instead.
Splits input in the default Perl manner, splitting on all whitespace. This method is identical to calling:
    split("/\\s+/", input);
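public int length()
Returns the length of the last match found.
Specified by: length in interface MatchResult
public int groups()
Specified by: groups in interface MatchResult
public java.lang.String group(int group)
Returns the contents of the parenthesized subgroups of the last match found according to the behavior dictated by the MatchResult interface.
Specified by: group in interface MatchResult
Parameters:
group - The pattern subgroup to return.
public int begin(int group)
Returns the begin offset of the subgroup of the last match found relative to the beginning of the match.
Specified by: begin in interface MatchResult
Parameters:
group - The pattern subgroup.
public int end(int group)
Returns the end offset of the subgroup of the last match found relative to the beginning of the match.
Specified by: end in interface MatchResult
Parameters:
group - The pattern subgroup.
public int beginOffset(int group)
Returns an offset marking the beginning of the last pattern match found relative to the beginning of the input from which the match was extracted.
Specified by: beginOffset in interface MatchResult
Parameters:
group - The pattern subgroup.
public int endOffset(int group)
Returns an offset marking the end of the last pattern match found relative to the beginning of the input from which the match was extracted.
Specified by: endOffset in interface MatchResult
Parameters:
group - The pattern subgroup.
public java.lang.String toString()
Returns the same as group(0).
Specified by: toString in interface MatchResult
Overrides: toString in class java.lang.Object
public java.lang.String preMatch()
Returns the part of the input preceding the last match found.
public java.lang.String postMatch()
Returns the part of the input following the last match found.
public char[] preMatchCharArray()
Returns the part of the input preceding the last match found as a char array.
public char[] postMatchCharArray()
Returns the part of the input following the last match found as a char array.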