Below is an implementation of a Regular Expression HashMap. It works with key-value pairs which the key is a regular expression. It compiles the key (regular expression) while adding (i.e. putting), so there is no compile time while getting. Once getting an element, you don't give regular expression; you give any possible value of a regular expression.
As a result, this behaviour provides to map numerous values of a regular expression into the same value. The class does not depend to any external libraries, uses only default java.util. So, it will be used simply when a behaviour like that is required.
- import java.util.ArrayList;
- import java.util.HashMap;
- import java.util.regex.Pattern;
- /**
- * This class is an extended version of Java HashMap
- * and includes pattern-value lists which are used to
- * evaluate regular expression values. If given item
- * is a regular expression, it is saved in regexp lists.
- * If requested item matches with a regular expression,
- * its value is get from regexp lists.
- *
- * @author cb
- *
- * @param <K> : Key of the map item.
- * @param <V> : Value of the map item.
- */
- public class RegExHashMap<K, V> extends HashMap<K, V> {
- // list of regular expression patterns
- private ArrayList<Pattern> regExPatterns = new ArrayList<Pattern>();
- // list of regular expression values which match patterns
- private ArrayList<V> regExValues = new ArrayList<V>();
- /**
- * Compile regular expression and add it to the regexp list as key.
- */
- @Override
- public V put(K key, V value) {
- regExPatterns.add(Pattern.compile(key.toString()));
- regExValues.add(value);
- return value;
- }
- /**
- * If requested value matches with a regular expression,
- * returns it from regexp lists.
- */
- @Override
- for (int i = 0; i < regExPatterns.size(); i++) {
- if (regExPatterns.get(i).matcher(cs).matches()) {
- return regExValues.get(i);
- }
- }
- return super.get(key);
- }
- }
You need to also Remove the patterns or this could become a memory leak.
ReplyDeleteMemory leak can not seem to be possible in this design. Can you explain the problem more detailed?
DeleteYou are mixing the data structures. Hashmap is to be a data structure whose complexity is almost constant i.e. O(1), considering collisions arising from hash function. If you put a for statement inside the get method, the access time complexity turns into O(n), which is the same complexity of an ordinary list. It would be better use an apart Map for data and another one for save compiled regex patterns (like a pool) and use them combined to accomplish the desired task. Data structures are serious science. Don't invent a new one without serious research.
ReplyDeleteYou are right about O(1), but this is not a standard HashMap. RegExp part of HashMap requires O(n) of course. This implementation provides an easy and encapsulated way of mapping expressions of patterns to values. Reducing O(n) complexity is another problem and there may be many different solutions about that.
DeleteWhy is this even called a Map implementation or more precisely a HashMap. This has nothing to do with a Map structure. You could have just as easily looked at HashMap.java and study their implementation prior to attempting this. There are probably some uses for your code but it should not muddy the waters for those looking to learn, but thank you for your effort.
ReplyDeleteIs doesn't completely work as a Java HashMap in the underlying structure as stated before, but its usage is the same as a HashMap for the user because of encapsulation.
DeleteAnd what is more to the point, it maps keys to values.
The get method would return the value for the first matched key which seems to be a bug as if there could be more matching keys in the regExPatterns list that would get skipped out. Thus extending a HashMap doesn't seem to be a good idea. Also was wondering how super.get(key) is supposed to work if the overriding put method is not adding the key/values to the parent.
ReplyDeleteIt would be better use an apart Map for information and another one for preserve collected regex styles and use them mixed to achieve the preferred process.
ReplyDeleteThe concept of a map from my point of view is to have random access, what I mean is that you could guess where is something without checking all the collection.
ReplyDeleteWhat I think you want is to recover from a collection an instance that match a Regexp so, maybe you could use a List of Objects and a Comparator where the compare sentence uses a Regexp . Then you could do a Collections.binarySearch() to find the target value.
An example of use could be to handle events from a state machine so you could have some concrete error handlers and one generic one (so if the concrete one match then it will handle the event but if there is no specific handler then the general handler will manage the event)... something like error.login handler and an error.* handler.