PinRex is a command-line tool that takes a list of postal codes in JSON format and generates optimized regular expressions that match exactly those codes. The generated patterns can then be used to validate postal codes in other applications.
- Generates optimized regex patterns for postal code validation
- Handles large sets of postal codes efficiently
- Supports verification of generated patterns
- Configurable regex length limits
- JSON input/output format
- C++ compiler with C++11 support
- Make build system
- nlohmann/json library for C++
You can install PinRex using Homebrew:
brew tap sishir2001/brewery
brew install pinrex
If you prefer to build from source:
- Clone the repository:
  git clone https://github.com/sishir2001/PinRex
- Navigate to the project directory:
  cd PinRex
- Build the project:
  mkdir build
  cd build
  cmake --build .
To generate regex patterns based on a list of postal codes, run the following command:
./pinrex -i <input_file_path>.json -o <output_file_path>.json [-l <regex_length_limit>] [--verify]
Options:
- -i: Input JSON file path containing postal codes
- -o: Output JSON file path for generated regex patterns
- -l: Optional regex length limit (default: 1000)
- --verify: Optional flag to verify generated regex patterns
- --version: Display version information
- --help: Display help message
The input file should be a JSON file containing an array of postal codes under the "postalCodes" key.
Example input file (input.json):
{
  "postalCodes": [
    110001,
    110002,
    110003,
    110004,
    110005
  ]
}
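If the postal codes live in another system, the input file can be produced programmatically. The following is a minimal Python sketch, assuming the codes are written as JSON integers as in the example above; the helper name `build_input_document` is hypothetical and not part of PinRex.

```python
import json


def build_input_document(codes):
    """Return JSON text shaped like the PinRex input example.

    Assumption: codes are stored as integers under "postalCodes",
    mirroring the example file; duplicates are dropped and the
    result is sorted so the file is stable across runs.
    """
    unique_codes = sorted({int(code) for code in codes})
    return json.dumps({"postalCodes": unique_codes}, indent=2)


if __name__ == "__main__":
    # Print the document; redirect it to input.json to use it with pinrex.
    print(build_input_document([110003, 110001, 110002, 110001]))
```

Deduplicating and sorting is a convenience chosen for this sketch, not something the tool is known to require.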
The output will be a JSON file containing the generated regex patterns that match the input postal codes.
Example output (output.json):
{
  "regexes": [
    "^11000[1-5]$"
  ]
}
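An application can consume this output by compiling each entry of "regexes" and accepting a code if any pattern matches. The sketch below is in Python and assumes the generated patterns use only syntax that Python's `re` module understands (anchors, character classes, groups, alternation); the function names are illustrative, and the JSON is inlined so the example runs without files.

```python
import json
import re

# Inlined copy of the example output.json shown above.
OUTPUT_JSON = '{"regexes": ["^11000[1-5]$"]}'


def load_patterns(output_text):
    """Compile every pattern listed under the "regexes" key."""
    return [re.compile(p) for p in json.loads(output_text)["regexes"]]


def is_valid_postal_code(code, patterns):
    """Return True if the code matches at least one pattern in full."""
    text = str(code)
    # fullmatch also rejects a trailing newline, which "$" alone would allow.
    return any(p.fullmatch(text) for p in patterns)


if __name__ == "__main__":
    patterns = load_patterns(OUTPUT_JSON)
    for candidate in (110003, 110006, "1100011"):
        print(candidate, is_valid_postal_code(candidate, patterns))
```

Because the output is a list, a code is treated as valid when any one of the patterns accepts it.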
The generated regex patterns:
- Are anchored with ^ and $ to ensure exact matches
- Use character classes [] for ranges of digits
- Use grouping () to capture common prefixes
- Use alternation | to match different possibilities
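To see how these four constructs combine, the Python sketch below enumerates the codes accepted by a small hand-written pattern. The pattern is an illustration only, not actual PinRex output, and `matched_codes` is a hypothetical helper.

```python
import re

# Hand-written illustration (not PinRex output): the shared prefix 110 is
# written once, the group holds two alternatives, and each alternative
# ends in a character class covering a run of digits.
PATTERN = re.compile(r"^110(00[1-5]|01[0-2])$")


def matched_codes(pattern, start, stop):
    """List every integer in [start, stop) whose decimal form matches."""
    return [n for n in range(start, stop) if pattern.fullmatch(str(n))]


if __name__ == "__main__":
    print(matched_codes(PATTERN, 110000, 110100))
    # -> [110001, 110002, 110003, 110004, 110005, 110010, 110011, 110012]

    # Without the anchors, a longer string containing a valid code
    # would also be accepted by a search.
    print(bool(re.search(r"110(00[1-5]|01[0-2])", "9110001")))  # -> True
    print(bool(PATTERN.search("9110001")))                      # -> False
```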
You can verify the generated regex patterns using the --verify flag:
./pinrex -i input.json -o output.json --verify
This will check if:
- All valid postal codes are matched by the patterns
- No invalid postal codes are matched, so the patterns match only valid postal codes
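The same two properties can be checked independently of the tool. The following Python sketch is a brute-force illustration and makes no claim about how --verify is implemented; `verify_patterns` and its `universe` parameter (the candidate codes to test against) are assumptions introduced here.

```python
import re


def verify_patterns(regexes, valid_codes, universe):
    """Return (missed, extra) for a set of patterns.

    missed: valid codes that no pattern matches.
    extra:  codes in `universe` that are not valid but still match.
    Both lists are empty when the patterns are exact over the universe.
    """
    compiled = [re.compile(r) for r in regexes]
    valid = {str(code) for code in valid_codes}

    def matches(text):
        return any(p.fullmatch(text) for p in compiled)

    missed = sorted(code for code in valid if not matches(code))
    extra = sorted(
        str(code) for code in universe
        if str(code) not in valid and matches(str(code))
    )
    return missed, extra


if __name__ == "__main__":
    valid = range(110001, 110006)      # 110001 .. 110005
    universe = range(110000, 110100)   # small candidate range for the demo
    print(verify_patterns(["^11000[1-5]$"], valid, universe))  # -> ([], [])
    print(verify_patterns(["^11000[1-6]$"], valid, universe))  # -> ([], ['110006'])
```

Scanning a candidate range is only practical for short numeric codes; a larger code space would need a different strategy.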
The program includes error checking for:
- Invalid input file format
- Missing input/output files
- Invalid JSON structure
- Invalid command-line arguments
- Regex generation failures
The tool is optimized to:
- Handle large sets of postal codes
- Generate compact regex patterns
- Minimize regex backtracking
- Maintain reasonable memory usage
Contributions are welcome! If you find any issues or have suggestions for improvements, please:
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
Please ensure your code follows the existing style and includes appropriate tests.
This project is licensed under the MIT License.