Create custom datatypes using Pydantic module in Python

Last Updated : 16 Aug, 2021

We often need to pass a long list of variables to a function, and spelling them all out in the function signature gets messy. Problems also arise when the values passed in need validation: for a long list of variables it is difficult to keep validating the data inside the main function body, and it is not good practice either. In that scenario, it is better to group the variables into separate classes. Here, we demonstrate how to use pydantic to create such models along with custom validations. First, let's discuss the use case.

Suppose we receive some data from an API call and need to run some analysis on it. An API typically sends its response as JSON, so we want our models to be able to serialize and deserialize JSON (1).
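As a rough sketch of requirement (1), the snippet below round-trips a small JSON payload through a pydantic model. The `ApiAddress` model and the payload are hypothetical and exist only for this illustration. pydantic's own JSON helpers are named differently across major versions (`parse_raw`/`.json()` in v1, `model_validate_json`/`model_dump_json()` in v2), so this sketch goes through the standard `json` module instead.

```python
import json

from pydantic import BaseModel


class ApiAddress(BaseModel):
    # hypothetical model, used only for this illustration
    city: str
    pincode: int


# pretend this string is the body of an API response
raw = '{"city": "Noida", "pincode": 201305}'

# deserialize: JSON text -> dict -> validated model
address = ApiAddress(**json.loads(raw))

# serialize: model -> dict -> JSON text
# (dict(model) is enough for a flat model without nested models)
as_json = json.dumps(dict(address))
print(as_json)
```

For nested models, the version-specific helpers named above are the more convenient route.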

We also assume the types of certain variables. For example, if we are passing an address, we assume the pincode to be an integer value. This is type checking (2).
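A minimal sketch of what type checking (2) looks like in practice, assuming pydantic's default (non-strict) behaviour: a numeric string is converted to `int`, while a value that cannot be read as an integer is rejected. The `PincodeOnly` model is a hypothetical name used only here.

```python
from pydantic import BaseModel, ValidationError


class PincodeOnly(BaseModel):
    # hypothetical single-field model for illustration
    pincode: int


# a numeric string is coerced to int in the default mode
ok = PincodeOnly(pincode="201305")
print(type(ok.pincode).__name__, ok.pincode)

# a non-numeric string cannot be coerced and is rejected
try:
    PincodeOnly(pincode="not-a-number")
    rejected = False
except ValidationError:
    rejected = True

print("rejected:", rejected)
```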

To perform the analysis, we make some assumptions about the data, for example that the pincode matches the district name provided. This is validation (3).
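The pincode and district check could be sketched roughly as follows. The `Location` model and the `PINCODE_TO_DISTRICT` table are assumptions made for this example; the table is a tiny illustrative sample, not an authoritative data source. It uses the same `root_validator` style as the rest of this article, which newer pydantic releases mark as deprecated.

```python
from pydantic import BaseModel, ValidationError, root_validator

# illustrative sample only; a real lookup would come from a proper dataset
PINCODE_TO_DISTRICT = {
    201305: "Gautam Buddha Nagar",
    110001: "New Delhi",
}


class Location(BaseModel):
    # hypothetical model for this sketch
    pincode: int
    district: str

    # runs after the individual fields have been type-checked
    @root_validator(skip_on_failure=True)
    def pincode_matches_district(cls, values):
        expected = PINCODE_TO_DISTRICT.get(values["pincode"])
        # unknown pincodes are let through; known ones must match
        if expected is not None and expected != values["district"]:
            raise ValueError("pincode does not belong to the given district")
        return values


good = Location(pincode=201305, district="Gautam Buddha Nagar")

try:
    Location(pincode=201305, district="New Delhi")
    mismatch_rejected = False
except ValidationError:
    mismatch_rejected = True

print(good.district, mismatch_rejected)
```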

We might also require that certain fields, such as the state, take a value from a fixed list (say, the states of India) and not any arbitrary value. This falls under cleaning (4).
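Restricting a field to a fixed set of values can be sketched with Python's built-in `Enum` alone, before pydantic enters the picture. The `State` class and the `clean_state` helper below are hypothetical names for this illustration; the helper normalises whitespace and case, then lets the enum reject anything outside the list.

```python
from enum import Enum


class State(str, Enum):
    # a small illustrative subset of allowed values
    UTTAR_PRADESH = "UP"
    WEST_BENGAL = "WB"


def clean_state(raw: str) -> State:
    """Normalise the input and map it to an allowed State.

    Raises ValueError when the value is not one of the choices.
    """
    return State(raw.strip().upper())


cleaned = clean_state("  up ")
print(cleaned.name, cleaned.value)

try:
    clean_state("XX")
    unknown_rejected = False
except ValueError:
    unknown_rejected = True

print("unknown rejected:", unknown_rejected)
```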

So, with these four requirements, let's start coding our model. I assume that you have Python installed on your system. To install pydantic, simply run:

pip install pydantic
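The code in this article uses `validator` and `root_validator`, which belong to the pydantic 1.x API; pydantic 2.x keeps them as deprecated aliases and recommends `field_validator` and `model_validator` instead. A quick way to see which major version got installed, sketched with the standard library only:

```python
from importlib.metadata import version

# read the installed distribution's version string, e.g. "1.10.13" or "2.5.3"
installed = version("pydantic")
major = int(installed.split(".")[0])

print(f"pydantic {installed} (major version {major})")
```

On a 2.x installation, the article's code may emit deprecation warnings when the models are defined.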

With that set, create a file called models.py and paste the following code into it. Detailed inline comments in the code make it easier to follow.

Python3

# import required modules
from enum import Enum
from typing import Optional
from pydantic import BaseModel, PositiveInt, validator, root_validator, constr



# custom class used as choices for state
# pydantic choices using the built-in Enum of python
# which reduces the need for additional packages
class StateTypes(str, Enum):

    DELHI = "DLH"
    UTTAR_PRADESH = "UP"
    # Bengaluru is a city, so the state it belongs to is listed instead
    KARNATAKA = "KA"
    WEST_BENGAL = "WB"

    
    
# class to get personal credentials
class PersonalDetails(BaseModel):
    id: int
    
    # constr gives us the ability to specify
    # the min and max length
    name: constr(min_length=2, max_length=15)
    phone: PositiveInt

    # validation at field level:
    # receives the already type-checked phone number
    @validator("phone")
    def phone_length(cls, v):
        
        # phone number should typically be of length 10
        if len(str(v)) != 10:
            raise ValueError("Phone number must be of ten digits")
        return v


                 
# class to get address                 
class Address(BaseModel):
    id: int
    address_line_1: constr(max_length=50)
    
    # assigning some fields to be optional
    address_line_2: Optional[constr(max_length=50)] = None
    pincode: PositiveInt
    city: constr(max_length=30)
    
    # using choices in python is this simple.
    # Just create a class with Enums as choices
    # and then pass the class as the type for the field
    state: StateTypes

    @validator("pincode")
    def pincode_length(cls, v):
        if len(str(v)) != 6:
            raise ValueError("Pincode must be of six digits")
        return v

                 

# using BaseModels as custom datatypes
# in the User class                 
class User(BaseModel):
    
    personal_details: PersonalDetails
    address: Address

    # skip_on_failure=True means this validator is skipped
    # if any field of the class has already failed validation
    @root_validator(skip_on_failure=True)
    def check_id(cls, values):
        
        # custom validation ensuring personal_details.id is 
        # same as address.id
        personal_details: PersonalDetails = values.get("personal_details")
        address: Address = values.get("address")
        if personal_details.id != address.id:
            raise ValueError(
                "ID field of both personal_details as well as address should match"
            )
        return values

                 

# Driver Code 
if __name__ == "__main__":

    # testing models
    validated_data = {
        "personal_details": {
            "id": 1,
            "name": "GeeksforGeeks",
            "phone": 9999999999,
        },
        "address": {
            "id": 1,
            "address_line_1": "Sector- 136",
            "pincode": 201305,
            "city": "Noida",
            "state": "UP",
        },
    }
    
    # this would work without any error as
    # no validation will fail
    user = User(**validated_data)
    
    # would print the standard __str__ value for the model
    print(user)

    unvalidated_data = {
        "personal_details": {
            "id": 1,
            "name": "GeeksforGeeks",
            "phone": 9999999999,
        },
        "address": {
            "id": 2,
            "address_line_1": "Sector- 136",
            "pincode": 201305,
            "city": "Noida",
            "state": "UP",
        },
    }
    
    # this would raise a ValidationError since the IDs
    # are different
    user = User(**unvalidated_data)
    print(user)

Output:

Upon running this, the first print statement executes successfully, but the second initialization of the User model raises a ValidationError (a subclass of ValueError) because the IDs of the personal details and the address do not match.
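In a real program you would usually catch the error instead of letting it stop the script. The sketch below shows one way to do that and to inspect the individual error entries. `PincodeModel` is a hypothetical cut-down model that reuses the pincode check from the article so that the example stays self-contained.

```python
from pydantic import BaseModel, PositiveInt, ValidationError, validator


class PincodeModel(BaseModel):
    # hypothetical cut-down model for this illustration
    pincode: PositiveInt

    @validator("pincode")
    def pincode_length(cls, v):
        if len(str(v)) != 6:
            raise ValueError("Pincode must be of six digits")
        return v


try:
    PincodeModel(pincode=123)
    errors = []
except ValidationError as exc:
    # errors() returns a list of dicts, one per failed check
    errors = exc.errors()

for err in errors:
    # "loc" names the offending field, "msg" carries the message
    print(err["loc"], err["msg"])
```

The exact wording of `msg` varies a little between pydantic versions, so it is safer to branch on `loc` than to compare full message strings.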

debdutgoswami
