author Irene Knapp <ireneista@gmail.com> 2020-09-15 20:22:35 -0700
committer Irene Knapp <ireneista@gmail.com> 2020-09-15 20:22:35 -0700
commit cca86b496e00605163e96d93cda4b9b248df91fe
tree 3797c1c9ed3072383d7d70179140fe4f4409dac9 /src/commandline.lalrpop
parent 320664d6b82229a4a07fa8045dd93f260ca61308
Use lalrpop in a trivial way as a proof of concept. Mess with some Unicode character classes.
Diffstat (limited to 'src/commandline.lalrpop')
-rw-r--r-- src/commandline.lalrpop 16
1 files changed, 16 insertions, 0 deletions
diff --git a/src/commandline.lalrpop b/src/commandline.lalrpop
new file mode 100644
index 0000000..0655281
--- /dev/null
+++ b/src/commandline.lalrpop
@@ -0,0 +1,16 @@
+grammar;
+
+//
+// Z is the Unicode class for separators, including line, paragraph, and space
+// separators. C is the "Other" class (control, format, surrogate, private-use,
+// unassigned). P is the class for punctuation. This regexp intersects the
+// negations of these classes, matching any character NOT in any of the three.
+//
+// [1] is the official reference, and [2] is a site that is useful for browsing
+// to get an intuitive idea of what these classes mean.
+//
+// [1] http://www.unicode.org/reports/tr44/#General_Category_Values
+// [2] https://www.compart.com/en/unicode/category
+//
+pub Filename: String = <filename:r"[\P{Z}&&\P{C}&&\P{P}]+"> => filename.to_string();
+
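The character-class logic in the comment above can be cross-checked outside the Rust toolchain. The following is a minimal sketch in Python, assuming the standard-library `unicodedata` module's general categories line up with what the regex engine uses for `\P{Z}`, `\P{C}`, and `\P{P}` (Unicode versions may differ between the two). The names `is_filename_char` and `matches_filename` are hypothetical helpers, not part of the commit, and `matches_filename` checks a whole string, whereas the grammar token matches a run of characters inside larger input.

```python
import unicodedata

# Major general-category letters that the Filename token excludes:
# Z (separators), C (control and other non-graphic), P (punctuation).
EXCLUDED_MAJOR_CLASSES = {"Z", "C", "P"}


def is_filename_char(ch: str) -> bool:
    """Return True if ch lies outside all three excluded major classes.

    unicodedata.category() returns a two-letter code such as "Ll" or "Zs";
    the first letter is the major class.
    """
    return unicodedata.category(ch)[0] not in EXCLUDED_MAJOR_CLASSES


def matches_filename(text: str) -> bool:
    """Hypothetical whole-string check: one or more allowed characters."""
    return len(text) > 0 and all(is_filename_char(ch) for ch in text)


if __name__ == "__main__":
    for sample in ["readme", "read me", "a.txt", "a+b", "a\tb"]:
        print(repr(sample), matches_filename(sample))
```

Under these assumptions, `.`, `-`, and `_` all fall in the punctuation class, so a name like `a.txt` would not be accepted as a single `Filename` token by this definition, while symbols such as `+` (class S) would be.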