A spec for URLs in Clojure
Let's get some clojure.spec:
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.gen.alpha :as sgen])
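Spec works out of the box with predicates for a number of types, including URIs (note that uri? checks for a java.net.URI instance, so a plain string would not pass):
(s/valid? uri? (java.net.URI. "http://conan.is"))
=> true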
Many of these also include generators:
(sgen/sample (s/gen uri?) 3)
=>
(#object[java.net.URI 0x2c33ebfb "http://c4833483-df0c-42b5-9256-3a13db2639d2.com"]
#object[java.net.URI 0x41621c4e "http://41527909-73d2-4b95-9b5d-b6507c83898b.com"]
#object[java.net.URI 0x1e98c9ff "http://d585f324-7f35-4f3b-a39e-d48ea5489697.com"])
This generator isn't very good - the URLs it generates are valid, but they're all pretty similar. We need a more comprehensive generator that will produce a wide variety of URLs; after all, variety is the spice of generative testing.
URLs consist of many parts:
- protocol
- username
- password
- host
- port
- path
- query
- anchor
Your average URL won't contain them all, but we want our tests to be as comprehensive as possible.
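To make the list above concrete, here is a small sketch that pulls those parts out of a single URL using only java.net.URI from the JDK. The helper name uri-parts, the example.com address and the credentials are made up for illustration; note that java.net.URI reports username and password together as "user info".

```clojure
;; Hypothetical helper, for illustration only.
(defn uri-parts
  "Splits a URL string into its components using java.net.URI."
  [^String s]
  (let [u (java.net.URI. s)]
    {:protocol (.getScheme u)
     :userinfo (.getUserInfo u)   ; username and password joined by ":"
     :host     (.getHost u)
     :port     (.getPort u)
     :path     (.getPath u)
     :query    (.getQuery u)
     :anchor   (.getFragment u)}))

(uri-parts "https://user:secret@example.com:8443/a/b?x=1#top")
;; => {:protocol "https", :userinfo "user:secret", :host "example.com",
;;     :port 8443, :path "/a/b", :query "x=1", :anchor "top"}
```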
Component generators
Let's write our own generator for URLs, and use it in an improved URL spec. A generator is a no-arg function that returns a clojure.test.check.generators.Generator.
To construct URLs we can use Chas Emerick's handy url library:
(require '[cemerick.url :as url])
It features a handy constructor for URLs, ->URL, which takes the above components as positional arguments and gives us back a cemerick.url.URL record (Types? Constructors? Objects? OOPs!). We'll need a generator for each of them, but first let's set ourselves up with a generator that makes non-empty alphanumeric strings - we'll be needing a lot of these later:
(defn non-empty-string-alphanumeric
  []
  (sgen/such-that #(not= "" %)
                  (sgen/string-alphanumeric)))
(sgen/generate (non-empty-string-alphanumeric))
=> "j1cf5jDg6toLyP"
The such-that here takes a predicate and another generator, and filters that generator's output using the predicate. This does mean that some of the generated values are thrown away, which is wasted work, but we just want to filter out the empty strings, and they're rare compared to all possible outputs of string-alphanumeric.
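As a quick sanity check of that filtering, the sketch below samples the generator and confirms that no empty strings slip through. It also shows not-empty as a possible alternative to the hand-written predicate; the name non-empty-string-alphanumeric-alt is made up here, and both versions need test.check on the classpath.

```clojure
(require '[clojure.spec.gen.alpha :as sgen])

(defn non-empty-string-alphanumeric
  []
  (sgen/such-that #(not= "" %)
                  (sgen/string-alphanumeric)))

;; Hypothetical alternative: not-empty filters out empty collections
;; and empty strings.
(defn non-empty-string-alphanumeric-alt
  []
  (sgen/not-empty (sgen/string-alphanumeric)))

;; Every sampled value should be a string with at least one character.
(every? #(and (string? %) (pos? (count %)))
        (sgen/sample (non-empty-string-alphanumeric) 100))
;; => true
```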
Protocol
This is easy: we just want to allow http and https for now, so we can use elements to select randomly from a set:
(sgen/elements #{"http" "https"})
Username, password, host
These are just strings, and surprisingly they can be empty. We can use string-alphanumeric.
Port
Ports can be any integer from 1 to 65535, and the choose function lets us create generators which select from a range of numbers:
(sgen/choose 1 65535)
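Both bounds passed to choose are inclusive, which a short sketch can confirm by sampling. The name sample-ports is only used for this illustration.

```clojure
(require '[clojure.spec.gen.alpha :as sgen])

;; Draw 100 ports from the generator (sample-ports is a made-up name).
(def sample-ports
  (sgen/sample (sgen/choose 1 65535) 100))

;; All of them should fall inside the valid port range.
(every? #(<= 1 % 65535) sample-ports)
;; => true
```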
Path
This is where things get interesting. The path section of a URL is a sequence of non-empty strings separated by forward slashes. We can make more complex generators like this by combining simpler ones.
Now that we have a generator that can create non-empty strings, we can easily create non-empty vectors of non-empty alphanumeric strings, and separate them with slashes:
(sgen/fmap #(->> %
                 (interleave (repeat "/"))
                 (apply str))
           (sgen/not-empty
            (sgen/vector
             (non-empty-string-alphanumeric))))
Starting from the bottom, we're using our non-empty-string-alphanumeric generator and passing it to vector to get a vector of non-empty strings. We're ensuring the vector itself is not-empty. Finally we're applying a function to the generated vector using fmap, which interleaves the strings with forward slashes and gives us back the whole lot as a single string.
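To see what the function passed to fmap does on its own, here it is pulled out and applied to a fixed vector rather than a generated one. The name segments->path is made up for this sketch.

```clojure
;; Hypothetical standalone version of the function passed to fmap.
(defn segments->path
  [segments]
  (->> segments
       (interleave (repeat "/"))  ; "/" "a" "/" "b" ...
       (apply str)))

(segments->path ["a" "b" "c"])
;; => "/a/b/c"

(segments->path ["x"])
;; => "/x"
```

Because interleave stops as soon as the shorter sequence runs out, the infinite (repeat "/") is safe here, and every segment ends up prefixed with exactly one slash.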
Query
To generate query params, we need a map of random keys and values, which we can create using map. It takes two generators, one for the keys and one for the values, and a map of options:
(sgen/map
 ;; call the functions: map expects generators, not generator functions
 (non-empty-string-alphanumeric)
 (non-empty-string-alphanumeric)
 {:max-elements 3})
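A small sketch of the :max-elements option in action: sampling the map generator and checking that no generated map has more than three entries. The name query-gen is made up for this illustration, and the helper is repeated so the snippet runs on its own.

```clojure
(require '[clojure.spec.gen.alpha :as sgen])

(defn non-empty-string-alphanumeric
  []
  (sgen/such-that #(not= "" %)
                  (sgen/string-alphanumeric)))

;; Hypothetical name for the query-params generator.
(def query-gen
  (sgen/map (non-empty-string-alphanumeric)
            (non-empty-string-alphanumeric)
            {:max-elements 3}))

;; Maps may be empty, but never hold more than three entries.
(every? #(<= 0 (count %) 3) (sgen/sample query-gen 50))
;; => true
```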
Anchor
Anchors can be empty, so we use string-alphanumeric.
URL generator
Here's how our full URL generator looks:
(defn url-gen
  "Generator for generating URLs; note that it may generate
  http URLs on port 443 and https URLs on port 80, and only
  uses alphanumerics"
  []
  (sgen/fmap
   (partial apply (comp str url/->URL))
   (sgen/tuple
    ;; protocol
    (sgen/elements #{"http" "https"})
    ;; username
    (sgen/string-alphanumeric)
    ;; password
    (sgen/string-alphanumeric)
    ;; host
    (sgen/string-alphanumeric)
    ;; port
    (sgen/choose 1 65535)
    ;; path
    (sgen/fmap #(->> %
                     (interleave (repeat "/"))
                     (apply str))
               (sgen/not-empty
                (sgen/vector
                 (non-empty-string-alphanumeric))))
    ;; query
    (sgen/map
     (non-empty-string-alphanumeric)
     (non-empty-string-alphanumeric)
     {:max-elements 2})
    ;; anchor
    (sgen/string-alphanumeric))))
We've used tuple to generate the vector of elements we need using each of our generators, and passed that vector to our URL constructor. As the last step we turn it into a string.
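The same tuple-then-apply pattern can be tried on a smaller scale. The sketch below combines just two generators and spreads the resulting vector into a two-argument function; origin-gen and the hard-coded localhost host are made up for illustration and are not part of the URL generator above.

```clojure
(require '[clojure.spec.gen.alpha :as sgen])

;; Hypothetical two-part generator: protocol and port only.
(def origin-gen
  (sgen/fmap
   ;; tuple yields a vector such as ["http" 8080];
   ;; (partial apply f) spreads it into f's positional arguments.
   (partial apply (fn [protocol port]
                    (str protocol "://localhost:" port)))
   (sgen/tuple
    (sgen/elements #{"http" "https"})
    (sgen/choose 1 65535))))

(sgen/generate origin-gen)
;; returns a string of the form "<protocol>://localhost:<port>"
```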
We can now generate random URLs:
(sgen/generate (url-gen))
=> "https://S5xusj6zS:Up9S786kGF@1QK4956802NQuGZE4vgq7Q5w689v:61603/a4jWg250687kTTS9iA3FCXLKbxT1/5aa2Pzlg0Xg9Z5gFd22v09r3/507Q838m1513t339sXeCYuhSU2RV/63HP3s0Lw9BeTgDL7?u0X1VI4hPy7P392yY8Jn4e9L394lg4=LRiDy9zLyi6MBb2J#"
URL Spec
We can now create a spec using our URL generator:
(s/def ::url (s/with-gen
               (s/and string?
                      #(try
                         (url/url %)
                         (catch Throwable t false)))
               url-gen))
with-gen takes a spec and a no-args function that returns a generator (spec wraps generators in functions so that test.check is only loaded when a generator is actually needed). Our spec first checks that we're dealing with a string, and then tries to construct a URL from it, failing if an exception is thrown.
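The with-gen pattern can be tried without the url library on a simpler spec. The sketch below is an illustration only: ::port and port-gen are made-up names, not part of the code above. It attaches a custom generator to a port spec and checks that everything the generator produces satisfies the spec.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.gen.alpha :as sgen])

;; Hypothetical no-args generator function, as with-gen expects.
(defn port-gen
  []
  (sgen/choose 1 65535))

;; Hypothetical spec: an integer in the valid port range.
(s/def ::port (s/with-gen
                (s/and int? #(<= 1 % 65535))
                port-gen))

(s/valid? ::port 8080)
;; => true

(s/valid? ::port 0)
;; => false

;; Values drawn from the spec's generator conform to the spec.
(every? #(s/valid? ::port %) (sgen/sample (s/gen ::port) 50))
;; => true
```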
We can now validate URLs:
(s/valid? ::url "http://conan.is")
=> true
You can find the code here.
Neat!