Create custom datatypes using Pydantic module in Python
Last Updated :
16 Aug, 2021
We often need to pass a long list of variables to a function, and spelling all of them out in the Python function signature gets messy. Problems also arise when the values passed in need some kind of validation: for a long list of variables, it is difficult to keep validating the data inside the main function body, and it is not good practice either. In that scenario, you want to group your variables into separate classes. Here, we demonstrate how you can use pydantic to create models along with your own custom validations. First, let's discuss the use case.
Suppose we are receiving some data from an API call and need to run some analysis on it. An API typically sends its response as JSON, so we want our models to be able to serialize and deserialize JSON (1).
We would also assume the types of certain variables. For example, if we are passing an address, we would assume the pincode to be an integer value. This is type checking (2).
To perform the analysis, we would make some assumptions about the data, for example that the pincode should match the district name provided. This is validation (3).
We might also assume that certain fields, such as the state, should come from a fixed list (say, the states of India) and not be an arbitrary value. This falls under cleaning (4).
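Three of the four requirements above (JSON handling, type checking and restricting a field to a fixed list) can be sketched in a few lines. This is only a minimal illustration: the `Location` model, the `State` enum and the sample payload are hypothetical names made up for this sketch, and it assumes pydantic's default behaviour of coercing a numeric string to an `int`. Validation across fields (3) needs custom validators and is not covered here.

```python
import json
from enum import Enum

from pydantic import BaseModel, ValidationError


class State(str, Enum):
    # hypothetical two-entry list of allowed states (requirement 4)
    UTTAR_PRADESH = "UP"
    WEST_BENGAL = "WB"


class Location(BaseModel):
    # hypothetical model, used only for this illustration
    city: str
    pincode: int   # type checking (requirement 2)
    state: State   # must be one of the Enum values (requirement 4)


# (1) deserialize: JSON text -> dict -> model
payload = '{"city": "Noida", "pincode": "201305", "state": "UP"}'
location = Location(**json.loads(payload))

# (2) the numeric string from the payload has been converted to an int
pincode_type = type(location.pincode).__name__
print(location.pincode, pincode_type)

# (4) a value outside the Enum is rejected with a ValidationError
try:
    Location(city="Noida", pincode=201305, state="XX")
    state_rejected = False
except ValidationError:
    state_rejected = True
print("invalid state rejected:", state_rejected)

# (1) serialize: model -> plain dict -> JSON text
as_json = json.dumps(
    {
        "city": location.city,
        "pincode": location.pincode,
        "state": location.state.value,
    }
)
print(as_json)
```

The serialization step is written by hand with the standard `json` module so that the sketch does not depend on any particular pydantic release; pydantic models also provide their own export helpers.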
So, with these four requirements, let's start coding our models. We assume that you have Python installed on your system. To install pydantic, simply run:
pip install pydantic
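It can help to check which pydantic release was installed, because the models in this article use the `validator` and `root_validator` decorators, which pydantic 2.x marks as deprecated in favour of `field_validator` and `model_validator`. A small check, assuming only that the package exposes its version string as `pydantic.VERSION`:

```python
import pydantic

# VERSION is a string such as "1.10.13" or "2.5.0"
major_version = int(pydantic.VERSION.split(".")[0])
print(f"pydantic {pydantic.VERSION} (major version {major_version})")
```

On a 2.x release the article's code may emit deprecation warnings for these decorators.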
With that set, create a file called models.py and paste the code below into it. We have added detailed inline comments in the code itself to make it easier to follow.
Python3
# import required modules
from enum import Enum
from typing import Optional

from pydantic import BaseModel, PositiveInt, validator, root_validator, constr


# custom class used as choices for state
# pydantic choices using the built-in Enum of python
# which reduces the need for additional packages
class StateTypes(str, Enum):
    DELHI = "DLH"
    UTTAR_PRADESH = "UP"
    # Bengaluru is a city, so the state it belongs to is listed instead
    KARNATAKA = "KA"
    WEST_BENGAL = "WB"


# class to get personal credentials
class PersonalDetails(BaseModel):
    id: int
    # constr gives us the ability to specify
    # the min and max length
    name: constr(min_length=2, max_length=15)
    phone: PositiveInt

    # validation at field level:
    # a phone number should typically be of length 10
    @validator("phone")
    def phone_length(cls, v):
        if len(str(v)) != 10:
            raise ValueError("Phone number must be of ten digits")
        return v


# class to get address
class Address(BaseModel):
    id: int
    address_line_1: constr(max_length=50)
    # assigning some fields to be optional
    address_line_2: Optional[constr(max_length=50)] = None
    pincode: PositiveInt
    city: constr(max_length=30)
    # using choices in python is this simple.
    # Just create a class with Enums as choices
    # and then pass the class as the type for the field
    state: StateTypes

    @validator("pincode")
    def pincode_length(cls, v):
        if len(str(v)) != 6:
            raise ValueError("Pincode must be of six digits")
        return v


# using BaseModels as custom datatypes
# in the User class
class User(BaseModel):
    personal_details: PersonalDetails
    address: Address

    # skip_on_failure=True means this root validator is
    # skipped if validation of any field has already failed
    @root_validator(skip_on_failure=True)
    def check_id(cls, values):
        # custom validation ensuring personal_details.id is
        # same as address.id
        personal_details: PersonalDetails = values.get("personal_details")
        address: Address = values.get("address")
        if personal_details.id != address.id:
            raise ValueError(
                "ID field of both personal_details as well as address should match"
            )
        return values


# Driver Code
if __name__ == "__main__":
    # testing models
    validated_data = {
        "personal_details": {
            "id": 1,
            "name": "GeeksforGeeks",
            "phone": 9999999999,
        },
        "address": {
            "id": 1,
            "address_line_1": "Sector- 136",
            "pincode": 201305,
            "city": "Noida",
            "state": "UP",
        },
    }
    # this would work without any error as
    # no validation will fail
    user = User(**validated_data)
    # would print the standard __str__ value for the model
    print(user)

    unvalidated_data = {
        "personal_details": {
            "id": 1,
            "name": "GeeksforGeeks",
            "phone": 9999999999,
        },
        "address": {
            "id": 2,
            "address_line_1": "Sector- 136",
            "pincode": 201305,
            "city": "Noida",
            "state": "UP",
        },
    }
    # this would raise a validation error since the IDs
    # are different, so the print below is never reached
    user = User(**unvalidated_data)
    print(user)
Output:
Upon running this, the first print statement executes successfully, but the second initialization of the User model raises a ValidationError (a subclass of ValueError) because the IDs in personal_details and address do not match.
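In practice you would usually catch the error instead of letting the script stop. The sketch below is one possible way to do that, not part of the original models.py: it uses trimmed-down copies of the models (only the fields needed for the ID check), and `try_build_user` is a hypothetical helper name. It relies on the same `root_validator` decorator as the article, so the note about pydantic 1.x style applies here too.

```python
from pydantic import BaseModel, ValidationError, root_validator


# trimmed-down copies of the article's models, kept small for illustration
class PersonalDetails(BaseModel):
    id: int
    name: str


class Address(BaseModel):
    id: int
    city: str


class User(BaseModel):
    personal_details: PersonalDetails
    address: Address

    @root_validator(skip_on_failure=True)
    def check_id(cls, values):
        if values["personal_details"].id != values["address"].id:
            raise ValueError(
                "ID field of both personal_details as well as address should match"
            )
        return values


def try_build_user(data):
    """Hypothetical helper: return (user, None) or (None, error messages)."""
    try:
        return User(**data), None
    except ValidationError as exc:
        # errors() returns one dict per failed check; "msg" holds the text
        return None, [err["msg"] for err in exc.errors()]


good = {
    "personal_details": {"id": 1, "name": "GeeksforGeeks"},
    "address": {"id": 1, "city": "Noida"},
}
bad = {
    "personal_details": {"id": 1, "name": "GeeksforGeeks"},
    "address": {"id": 2, "city": "Noida"},
}

good_user, good_errors = try_build_user(good)
bad_user, bad_errors = try_build_user(bad)

print(good_user)
print(bad_errors)
```

The exact wording of each message can differ slightly between pydantic releases, so code that reacts to a failure should not depend on matching the full string.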