summaryrefslogtreecommitdiff
path: root/python/voluptuous/README.md
diff options
context:
space:
mode:
Diffstat (limited to 'python/voluptuous/README.md')
-rw-r--r--python/voluptuous/README.md596
1 files changed, 596 insertions, 0 deletions
diff --git a/python/voluptuous/README.md b/python/voluptuous/README.md
new file mode 100644
index 0000000000..fd84c64e77
--- /dev/null
+++ b/python/voluptuous/README.md
@@ -0,0 +1,596 @@
+# Voluptuous is a Python data validation library
+
+[![Build Status](https://travis-ci.org/alecthomas/voluptuous.png)](https://travis-ci.org/alecthomas/voluptuous) [![Stories in Ready](https://badge.waffle.io/alecthomas/voluptuous.png?label=ready&title=Ready)](https://waffle.io/alecthomas/voluptuous)
+
+Voluptuous, *despite* the name, is a Python data validation library. It
+is primarily intended for validating data coming into Python as JSON,
+YAML, etc.
+
+It has three goals:
+
+1. Simplicity.
+2. Support for complex data structures.
+3. Provide useful error messages.
+
+## Contact
+
+Voluptuous now has a mailing list! Send a mail to
+[<voluptuous@librelist.com>](mailto:voluptuous@librelist.com) to subscribe. Instructions
+will follow.
+
+You can also contact me directly via [email](mailto:alec@swapoff.org) or
+[Twitter](https://twitter.com/alecthomas).
+
+To file a bug, create a [new issue](https://github.com/alecthomas/voluptuous/issues/new) on GitHub with a short example of how to replicate the issue.
+
+## Show me an example
+
+Twitter's [user search API](https://dev.twitter.com/docs/api/1/get/users/search) accepts
+query URLs like:
+
+```
+$ curl 'http://api.twitter.com/1/users/search.json?q=python&per_page=20&page=1
+```
+
+To validate this we might use a schema like:
+
+```pycon
+>>> from voluptuous import Schema
+>>> schema = Schema({
+... 'q': str,
+... 'per_page': int,
+... 'page': int,
+... })
+
+```
+
+This schema very succinctly and roughly describes the data required by
+the API, and will work fine. But it has a few problems. Firstly, it
+doesn't fully express the constraints of the API. According to the API,
+`per_page` should be restricted to at most 20, defaulting to 5, for
+example. To describe the semantics of the API more accurately, our
+schema will need to be more thoroughly defined:
+
+```pycon
+>>> from voluptuous import Required, All, Length, Range
+>>> schema = Schema({
+... Required('q'): All(str, Length(min=1)),
+... Required('per_page', default=5): All(int, Range(min=1, max=20)),
+... 'page': All(int, Range(min=0)),
+... })
+
+```
+
+This schema fully enforces the interface defined in Twitter's
+documentation, and goes a little further for completeness.
+
+"q" is required:
+
+```pycon
+>>> from voluptuous import MultipleInvalid, Invalid
+>>> try:
+... schema({})
+... raise AssertionError('MultipleInvalid not raised')
+... except MultipleInvalid as e:
+... exc = e
+>>> str(exc) == "required key not provided @ data['q']"
+True
+
+```
+
+...must be a string:
+
+```pycon
+>>> try:
+... schema({'q': 123})
+... raise AssertionError('MultipleInvalid not raised')
+... except MultipleInvalid as e:
+... exc = e
+>>> str(exc) == "expected str for dictionary value @ data['q']"
+True
+
+```
+
+...and must be at least one character in length:
+
+```pycon
+>>> try:
+... schema({'q': ''})
+... raise AssertionError('MultipleInvalid not raised')
+... except MultipleInvalid as e:
+... exc = e
+>>> str(exc) == "length of value must be at least 1 for dictionary value @ data['q']"
+True
+>>> schema({'q': '#topic'}) == {'q': '#topic', 'per_page': 5}
+True
+
+```
+
+"per\_page" is a positive integer no greater than 20:
+
+```pycon
+>>> try:
+... schema({'q': '#topic', 'per_page': 900})
+... raise AssertionError('MultipleInvalid not raised')
+... except MultipleInvalid as e:
+... exc = e
+>>> str(exc) == "value must be at most 20 for dictionary value @ data['per_page']"
+True
+>>> try:
+... schema({'q': '#topic', 'per_page': -10})
+... raise AssertionError('MultipleInvalid not raised')
+... except MultipleInvalid as e:
+... exc = e
+>>> str(exc) == "value must be at least 1 for dictionary value @ data['per_page']"
+True
+
+```
+
+"page" is an integer \>= 0:
+
+```pycon
+>>> try:
+... schema({'q': '#topic', 'per_page': 'one'})
+... raise AssertionError('MultipleInvalid not raised')
+... except MultipleInvalid as e:
+... exc = e
+>>> str(exc)
+"expected int for dictionary value @ data['per_page']"
+>>> schema({'q': '#topic', 'page': 1}) == {'q': '#topic', 'page': 1, 'per_page': 5}
+True
+
+```
+
+## Defining schemas
+
+Schemas are nested data structures consisting of dictionaries, lists,
+scalars and *validators*. Each node in the input schema is pattern
+matched against corresponding nodes in the input data.
+
+### Literals
+
+Literals in the schema are matched using normal equality checks:
+
+```pycon
+>>> schema = Schema(1)
+>>> schema(1)
+1
+>>> schema = Schema('a string')
+>>> schema('a string')
+'a string'
+
+```
+
+### Types
+
+Types in the schema are matched by checking if the corresponding value
+is an instance of the type:
+
+```pycon
+>>> schema = Schema(int)
+>>> schema(1)
+1
+>>> try:
+... schema('one')
+... raise AssertionError('MultipleInvalid not raised')
+... except MultipleInvalid as e:
+... exc = e
+>>> str(exc) == "expected int"
+True
+
+```
+
+### URL's
+
+URL's in the schema are matched by using `urlparse` library.
+
+```pycon
+>>> from voluptuous import Url
+>>> schema = Schema(Url())
+>>> schema('http://w3.org')
+'http://w3.org'
+>>> try:
+... schema('one')
+... raise AssertionError('MultipleInvalid not raised')
+... except MultipleInvalid as e:
+... exc = e
+>>> str(exc) == "expected a URL"
+True
+
+```
+
+### Lists
+
+Lists in the schema are treated as a set of valid values. Each element
+in the schema list is compared to each value in the input data:
+
+```pycon
+>>> schema = Schema([1, 'a', 'string'])
+>>> schema([1])
+[1]
+>>> schema([1, 1, 1])
+[1, 1, 1]
+>>> schema(['a', 1, 'string', 1, 'string'])
+['a', 1, 'string', 1, 'string']
+
+```
+
+### Validation functions
+
+Validators are simple callables that raise an `Invalid` exception when
+they encounter invalid data. The criteria for determining validity is
+entirely up to the implementation; it may check that a value is a valid
+username with `pwd.getpwnam()`, it may check that a value is of a
+specific type, and so on.
+
+The simplest kind of validator is a Python function that raises
+ValueError when its argument is invalid. Conveniently, many builtin
+Python functions have this property. Here's an example of a date
+validator:
+
+```pycon
+>>> from datetime import datetime
+>>> def Date(fmt='%Y-%m-%d'):
+... return lambda v: datetime.strptime(v, fmt)
+
+```
+
+```pycon
+>>> schema = Schema(Date())
+>>> schema('2013-03-03')
+datetime.datetime(2013, 3, 3, 0, 0)
+>>> try:
+... schema('2013-03')
+... raise AssertionError('MultipleInvalid not raised')
+... except MultipleInvalid as e:
+... exc = e
+>>> str(exc) == "not a valid value"
+True
+
+```
+
+In addition to simply determining if a value is valid, validators may
+mutate the value into a valid form. An example of this is the
+`Coerce(type)` function, which returns a function that coerces its
+argument to the given type:
+
+```python
+def Coerce(type, msg=None):
+ """Coerce a value to a type.
+
+ If the type constructor throws a ValueError, the value will be marked as
+ Invalid.
+ """
+ def f(v):
+ try:
+ return type(v)
+ except ValueError:
+ raise Invalid(msg or ('expected %s' % type.__name__))
+ return f
+
+```
+
+This example also shows a common idiom where an optional human-readable
+message can be provided. This can vastly improve the usefulness of the
+resulting error messages.
+
+### Dictionaries
+
+Each key-value pair in a schema dictionary is validated against each
+key-value pair in the corresponding data dictionary:
+
+```pycon
+>>> schema = Schema({1: 'one', 2: 'two'})
+>>> schema({1: 'one'})
+{1: 'one'}
+
+```
+
+#### Extra dictionary keys
+
+By default any additional keys in the data, not in the schema will
+trigger exceptions:
+
+```pycon
+>>> schema = Schema({2: 3})
+>>> try:
+... schema({1: 2, 2: 3})
+... raise AssertionError('MultipleInvalid not raised')
+... except MultipleInvalid as e:
+... exc = e
+>>> str(exc) == "extra keys not allowed @ data[1]"
+True
+
+```
+
+This behaviour can be altered on a per-schema basis. To allow
+additional keys use
+`Schema(..., extra=ALLOW_EXTRA)`:
+
+```pycon
+>>> from voluptuous import ALLOW_EXTRA
+>>> schema = Schema({2: 3}, extra=ALLOW_EXTRA)
+>>> schema({1: 2, 2: 3})
+{1: 2, 2: 3}
+
+```
+
+To remove additional keys use
+`Schema(..., extra=REMOVE_EXTRA)`:
+
+```pycon
+>>> from voluptuous import REMOVE_EXTRA
+>>> schema = Schema({2: 3}, extra=REMOVE_EXTRA)
+>>> schema({1: 2, 2: 3})
+{2: 3}
+
+```
+
+It can also be overridden per-dictionary by using the catch-all marker
+token `extra` as a key:
+
+```pycon
+>>> from voluptuous import Extra
+>>> schema = Schema({1: {Extra: object}})
+>>> schema({1: {'foo': 'bar'}})
+{1: {'foo': 'bar'}}
+
+```
+
+#### Required dictionary keys
+
+By default, keys in the schema are not required to be in the data:
+
+```pycon
+>>> schema = Schema({1: 2, 3: 4})
+>>> schema({3: 4})
+{3: 4}
+
+```
+
+Similarly to how extra\_ keys work, this behaviour can be overridden
+per-schema:
+
+```pycon
+>>> schema = Schema({1: 2, 3: 4}, required=True)
+>>> try:
+... schema({3: 4})
+... raise AssertionError('MultipleInvalid not raised')
+... except MultipleInvalid as e:
+... exc = e
+>>> str(exc) == "required key not provided @ data[1]"
+True
+
+```
+
+And per-key, with the marker token `Required(key)`:
+
+```pycon
+>>> schema = Schema({Required(1): 2, 3: 4})
+>>> try:
+... schema({3: 4})
+... raise AssertionError('MultipleInvalid not raised')
+... except MultipleInvalid as e:
+... exc = e
+>>> str(exc) == "required key not provided @ data[1]"
+True
+>>> schema({1: 2})
+{1: 2}
+
+```
+
+#### Optional dictionary keys
+
+If a schema has `required=True`, keys may be individually marked as
+optional using the marker token `Optional(key)`:
+
+```pycon
+>>> from voluptuous import Optional
+>>> schema = Schema({1: 2, Optional(3): 4}, required=True)
+>>> try:
+... schema({})
+... raise AssertionError('MultipleInvalid not raised')
+... except MultipleInvalid as e:
+... exc = e
+>>> str(exc) == "required key not provided @ data[1]"
+True
+>>> schema({1: 2})
+{1: 2}
+>>> try:
+... schema({1: 2, 4: 5})
+... raise AssertionError('MultipleInvalid not raised')
+... except MultipleInvalid as e:
+... exc = e
+>>> str(exc) == "extra keys not allowed @ data[4]"
+True
+
+```
+
+```pycon
+>>> schema({1: 2, 3: 4})
+{1: 2, 3: 4}
+
+```
+
+### Recursive schema
+
+There is no syntax to have a recursive schema. The best way to do it is to have a wrapper like this:
+
+```pycon
+>>> from voluptuous import Schema, Any
+>>> def s2(v):
+... return s1(v)
+...
+>>> s1 = Schema({"key": Any(s2, "value")})
+>>> s1({"key": {"key": "value"}})
+{'key': {'key': 'value'}}
+
+```
+
+### Extending an existing Schema
+
+Often it comes handy to have a base `Schema` that is extended with more
+requirements. In that case you can use `Schema.extend` to create a new
+`Schema`:
+
+```pycon
+>>> from voluptuous import Schema
+>>> person = Schema({'name': str})
+>>> person_with_age = person.extend({'age': int})
+>>> sorted(list(person_with_age.schema.keys()))
+['age', 'name']
+
+```
+
+The original `Schema` remains unchanged.
+
+### Objects
+
+Each key-value pair in a schema dictionary is validated against each
+attribute-value pair in the corresponding object:
+
+```pycon
+>>> from voluptuous import Object
+>>> class Structure(object):
+... def __init__(self, q=None):
+... self.q = q
+... def __repr__(self):
+... return '<Structure(q={0.q!r})>'.format(self)
+...
+>>> schema = Schema(Object({'q': 'one'}, cls=Structure))
+>>> schema(Structure(q='one'))
+<Structure(q='one')>
+
+```
+
+### Allow None values
+
+To allow value to be None as well, use Any:
+
+```pycon
+>>> from voluptuous import Any
+
+>>> schema = Schema(Any(None, int))
+>>> schema(None)
+>>> schema(5)
+5
+
+```
+
+## Error reporting
+
+Validators must throw an `Invalid` exception if invalid data is passed
+to them. All other exceptions are treated as errors in the validator and
+will not be caught.
+
+Each `Invalid` exception has an associated `path` attribute representing
+the path in the data structure to our currently validating value, as well
+as an `error_message` attribute that contains the message of the original
+exception. This is especially useful when you want to catch `Invalid`
+exceptions and give some feedback to the user, for instance in the context of
+an HTTP API.
+
+
+```pycon
+>>> def validate_email(email):
+... """Validate email."""
+... if not "@" in email:
+... raise Invalid("This email is invalid.")
+... return email
+>>> schema = Schema({"email": validate_email})
+>>> exc = None
+>>> try:
+... schema({"email": "whatever"})
+... except MultipleInvalid as e:
+... exc = e
+>>> str(exc)
+"This email is invalid. for dictionary value @ data['email']"
+>>> exc.path
+['email']
+>>> exc.msg
+'This email is invalid.'
+>>> exc.error_message
+'This email is invalid.'
+
+```
+
+The `path` attribute is used during error reporting, but also during matching
+to determine whether an error should be reported to the user or if the next
+match should be attempted. This is determined by comparing the depth of the
+path where the check is, to the depth of the path where the error occurred. If
+the error is more than one level deeper, it is reported.
+
+The upshot of this is that *matching is depth-first and fail-fast*.
+
+To illustrate this, here is an example schema:
+
+```pycon
+>>> schema = Schema([[2, 3], 6])
+
+```
+
+Each value in the top-level list is matched depth-first in-order. Given
+input data of `[[6]]`, the inner list will match the first element of
+the schema, but the literal `6` will not match any of the elements of
+that list. This error will be reported back to the user immediately. No
+backtracking is attempted:
+
+```pycon
+>>> try:
+... schema([[6]])
+... raise AssertionError('MultipleInvalid not raised')
+... except MultipleInvalid as e:
+... exc = e
+>>> str(exc) == "not a valid value @ data[0][0]"
+True
+
+```
+
+If we pass the data `[6]`, the `6` is not a list type and so will not
+recurse into the first element of the schema. Matching will continue on
+to the second element in the schema, and succeed:
+
+```pycon
+>>> schema([6])
+[6]
+
+```
+
+## Running tests.
+
+Voluptuous is using nosetests:
+
+ $ nosetests
+
+
+## Why use Voluptuous over another validation library?
+
+**Validators are simple callables**
+: No need to subclass anything, just use a function.
+
+**Errors are simple exceptions.**
+: A validator can just `raise Invalid(msg)` and expect the user to get
+useful messages.
+
+**Schemas are basic Python data structures.**
+: Should your data be a dictionary of integer keys to strings?
+`{int: str}` does what you expect. List of integers, floats or
+strings? `[int, float, str]`.
+
+**Designed from the ground up for validating more than just forms.**
+: Nested data structures are treated in the same way as any other
+type. Need a list of dictionaries? `[{}]`
+
+**Consistency.**
+: Types in the schema are checked as types. Values are compared as
+values. Callables are called to validate. Simple.
+
+## Other libraries and inspirations
+
+Voluptuous is heavily inspired by
+[Validino](http://code.google.com/p/validino/), and to a lesser extent,
+[jsonvalidator](http://code.google.com/p/jsonvalidator/) and
+[json\_schema](http://blog.sendapatch.se/category/json_schema.html).
+
+I greatly prefer the light-weight style promoted by these libraries to
+the complexity of libraries like FormEncode.