Project Proposal

Uploaded by

Mukund Jha

Per Field-instance Lookups
GSOC 2022

Table of Contents
1. Abstract

2. Django Lookups
2.1 Django Lookup API
2.2 Custom Lookups
2.3 RegisterLookupMixin

3. Limitations of current implementation

4. Goals and Objectives
4.1 Overview
4.2 Register new lookups for an instance
4.3 Unregister lookups for an instance
4.4 Register/Unregister lookups for a Model

5. Proposed Implementation
5.1 Overview
5.2 Names for methods
5.3 Differentiating class calls and instance calls
5.4 Storing lookups
5.5 get_lookups()
5.6 get_lookup()
5.7 register_lookup() and unregister_lookup()
5.8 Per-model custom lookups
5.9 Additional Features (optional)

6. Timeline
6.1 Phase 1 - June 13 to July 25 (6 weeks)
6.2 Phase 2 - July 25 to September 12 (tentative)
6.3 Pre-GSOC and community bonding period

7. Previous work

About Me


1. Abstract
This proposal aims to add support for registering and unregistering custom
lookups on a model’s Field instances rather than only on Field classes.

As of now, the Lookup API can register and unregister lookups only on Field
classes. So a custom lookup added to, or overridden on, a field class affects
all models that use that field class. This limits the customizability of model
lookups. In some cases, a user would like to have a custom lookup for one
specific model.

This would be a useful addition to Django lookups, allowing users to
customize lookups for each of their models.

Note: The code written throughout is just to convey my ideas - the actual
implementation might be different.


2. Django Lookups
2.1 Django Lookup API
The Lookup API constructs the WHERE clause of an SQL query. It consists of
two parts - the ‘RegisterLookupMixin’ class, which registers the lookup(s),
and the Query Expression API, which translates the lookup into SQL
expressions.

The lookup expression is passed as a keyword argument to filter (returns all
items that match the lookup), exclude (returns all items that don’t match the
lookup - the opposite of filter) or get (returns the single matching item).

The lookup expression consists of two parts (three if a transform exists) -

● The field on which the WHERE clause is applied - this could be an
instance of any Field class.
● A transform - alters a field (optional).
● The lookup - some inbuilt lookups include ‘exact’, ‘contains’, ‘gt’
(greater than) and ‘lte’ (less than or equal to).

An example query using a lookup expression -

>>> Employee.objects.filter(salary__gt=20000)
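
To make the parts of a lookup expression concrete, the sketch below splits a keyword such as salary__gt on the double underscore. It is a simplified illustration in plain Python, not Django's actual resolver: the function name split_lookup, the small KNOWN_LOOKUPS set and the field name joined are assumptions made for this example.

```python
LOOKUP_SEP = "__"  # separator used in Django lookup expressions
# Illustrative subset of built-in lookup names (assumption for this sketch).
KNOWN_LOOKUPS = {"exact", "contains", "gt", "gte", "lt", "lte"}


def split_lookup(expression):
    """Split 'field__transform__lookup' into (field, transforms, lookup)."""
    field, *rest = expression.split(LOOKUP_SEP)
    if rest and rest[-1] in KNOWN_LOOKUPS:
        *transforms, lookup = rest
    else:
        # No explicit lookup given: fall back to 'exact'.
        transforms, lookup = rest, "exact"
    return field, transforms, lookup


print(split_lookup("salary__gt"))         # ('salary', [], 'gt')
print(split_lookup("joined__year__lte"))  # ('joined', ['year'], 'lte')
print(split_lookup("name"))               # ('name', [], 'exact')
```

When no lookup name is given, as in name='Allen', the expression is treated as an ‘exact’ lookup.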


2.2 Custom Lookups


Custom lookups allow users to define their own lookups for special needs.
This could be done as follows -

from django.db.models import Lookup

class LessEqual(Lookup):
    lookup_name = 'le'

    def as_sql(self, compiler, connection):
        # Compile both sides of the comparison into SQL and parameters.
        lhs, lhs_params = self.process_lhs(compiler, connection)
        rhs, rhs_params = self.process_rhs(compiler, connection)
        params = lhs_params + rhs_params
        return '%s <= %s' % (lhs, rhs), params

The lookup is then registered by -

from django.db.models import Field


Field.register_lookup(LessEqual)

The above custom lookup is registered on the Field class. This makes the
lookup available on all fields which inherit from the Field class
(IntegerField, CharField, etc.). But realistically, you would want this
lookup to be used only for an IntegerField.

The lookup can be registered on IntegerField alone by -

from django.db.models import IntegerField


IntegerField.register_lookup(LessEqual)


2.3 RegisterLookupMixin
A mixin that implements the lookup API on a class. It has the following
methods -

register_lookup()
Registers a custom lookup in the class.

get_lookup()
Returns the Lookup named ‘lookup_name’ registered in the class. The default
implementation looks recursively on all parent classes and checks if any has a
registered lookup named lookup_name, returning the first match.

get_lookups()
Returns a dictionary of each lookup name registered in the class mapped to
the Lookup class.

get_transform()
Returns a Transform named ‘transform_name’.
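
The class-level behaviour described above can be sketched in plain Python. This is a toy stand-in written for illustration, not Django's RegisterLookupMixin: the class MiniRegisterLookupMixin and the bare Field, IntegerField, CharField and LessEqual classes are assumptions of this example, and only the idea of merging registrations along the inheritance chain is taken from the description.

```python
class MiniRegisterLookupMixin:
    """Toy stand-in for RegisterLookupMixin; class-level lookups only."""

    @classmethod
    def register_lookup(cls, lookup, lookup_name=None):
        if lookup_name is None:
            lookup_name = lookup.lookup_name
        # Create a dict owned by this exact class, not inherited from a parent.
        if "class_lookups" not in cls.__dict__:
            cls.class_lookups = {}
        cls.class_lookups[lookup_name] = lookup
        return lookup

    @classmethod
    def get_lookups(cls):
        merged = {}
        # Walk the MRO from base to subclass so subclasses override parents.
        for klass in reversed(cls.__mro__):
            merged.update(klass.__dict__.get("class_lookups", {}))
        return merged

    @classmethod
    def get_lookup(cls, lookup_name):
        return cls.get_lookups().get(lookup_name)


class Field(MiniRegisterLookupMixin):
    pass


class IntegerField(Field):
    pass


class CharField(Field):
    pass


class LessEqual:
    lookup_name = "le"


IntegerField.register_lookup(LessEqual)
print(IntegerField.get_lookup("le") is LessEqual)  # True
print(CharField.get_lookup("le"))                  # None
```

Registering on Field instead of IntegerField would make ‘le’ visible from both subclasses, because get_lookups() merges every parent's registrations.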


3. Limitations of current implementation
RegisterLookupMixin provides support for registering lookups only for the
Field class or its subclasses (CharField, IntegerField, DateTimeField, etc). This
was done because it was easier to implement.

For example in this Model -

from django.db import models

class Employee(models.Model):
    name = models.CharField(max_length=40)
    department = models.CharField(max_length=40)
    age = models.IntegerField()

If we want the ‘name’ field to have a custom lookup, we would do something
like this -

Registering a custom lookup on CharField -

from django.db.models import CharField, Lookup

@CharField.register_lookup
class NotEqual(Lookup):
    lookup_name = 'nte'

    def as_sql(self, compiler, connection):
        lhs, lhs_params = self.process_lhs(compiler, connection)
        rhs, rhs_params = self.process_rhs(compiler, connection)
        params = lhs_params + rhs_params
        return '%s <> %s' % (lhs, rhs), params

This, however, would also affect the ‘department’ field (because it is also a
CharField) and any other Model which has a CharField attribute. This limits
the flexibility of the custom lookup system in Django.

Similarly, unregistering a lookup would disable the lookup for every field of
that class in all models. The same applies to overriding lookups.

Adding per-field-instance lookups would be a great addition to the Django
lookup system.


4. Goals and Objectives


4.1 Overview
The purpose of providing per-instance custom lookups support is to -

● Create (register) custom lookups on field instances, making the lookup
system more customizable for each model.
● Override existing lookups of a model’s field instances if a user wishes to
change the behaviour of an existing lookup of a particular field class.
● Disable (unregister) costly lookups (for example ‘contains’ or ‘icontains’
on CharField), as pointed out in ticket #29799.

4.2 Register new lookups for an instance


Registering a new lookup for a Field instance such as the ‘age’ IntegerField
would look something like this -

from django.db.models import Lookup

class AbsoluteValueLessThan(Lookup):
    lookup_name = 'lt'

    def as_sql(self, compiler, connection):
        lhs, lhs_params = compiler.compile(self.lhs.lhs)
        rhs, rhs_params = self.process_rhs(compiler, connection)
        # lhs and rhs each appear twice in the SQL, so their params repeat.
        params = lhs_params + rhs_params + lhs_params + rhs_params
        return '%s < %s AND %s > -%s' % (lhs, rhs, lhs, rhs), params

If a lookup with the same lookup_name (“lt” in this case) already exists, it
is overridden by the new lookup. Both new and overriding lookups are added by
registration; an overriding lookup needs to have the same lookup_name as the
lookup it replaces.

To override a lookup for the age field instance of the Employee model, the
syntax would be as follows once the implementation is done.

age.register_lookup(AbsoluteValueLessThan)
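
The SQL fragment built by the as_sql method of AbsoluteValueLessThan can be traced without a database. The sketch below repeats only its string formatting in a plain function; the function name absolute_less_than_sql and the compiled column name "employee"."age" are assumptions chosen for illustration.

```python
def absolute_less_than_sql(lhs, lhs_params, rhs, rhs_params):
    """Mirror the string formatting done in AbsoluteValueLessThan.as_sql."""
    # Both lhs and rhs appear twice in the SQL, so their params repeat too.
    params = lhs_params + rhs_params + lhs_params + rhs_params
    return "%s < %s AND %s > -%s" % (lhs, rhs, lhs, rhs), params


# Hypothetical compiled pieces for a filter like age__lt=27.
sql, params = absolute_less_than_sql('"employee"."age"', [], "%s", [27])
print(sql)     # "employee"."age" < %s AND "employee"."age" > -%s
print(params)  # [27, 27]
```

The %s placeholders left in the SQL string are filled in by the database driver from params, which is why the right-hand value appears twice in the list.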

4.3 Unregister lookups for an instance


To unregister a lookup from the existing pool of lookups of an instance, for
example to disable the contains lookup on the name field of the Employee
model, the following could be done.

name.unregister_lookup(Contains)

This would enable users to disable costly lookups for third-party applications
that don't require contains.

4.4 Register/ Unregister lookups for a Model


A way to register or unregister lookups per model would also be a good
addition to the Django lookup API. Registering a lookup for the Employee
model would look like this -

Employee.register_lookup(Field, lookup)

I think this would be the best approach for users who need custom lookups
for a whole model rather than for each field instance individually. The same
could be done for unregistering lookups as well.


5. Proposed Implementation
5.1 Overview
Two WIP implementations already exist in #11408 and charettes@ccc58bf. Both
are good but take different approaches. I prefer charettes’ implementation
and think it fits the idea I have in mind; it mainly needs some tweaking,
improvement and optimization. Apart from registering and unregistering
lookups per field instance, I would also like to add support for registering
lookups for Models. I believe it would be a good addition.

5.2 Names for methods


First things first, let's think about the naming of the methods. It would be
unwise to have separate methods for registering lookups on classes and on
instances. Something like -

Field.register_class_lookup(lookup)

For registering class lookups and,

Field.register_instance_lookup(lookup)

For registering instance lookups. This is not ideal.

This is not a good idea because the existing method for registering lookups
would be renamed (register_lookup to register_class_lookup), forcing
existing users to adapt to the change. It is possible to implement a common
method register_lookup that registers on a class or on an instance depending
on what it is called on.

5.3 Differentiating class calls and instance calls


This part is already done in charettes@ccc58bf through the
classorinstancemethod class. When accessed, it dispatches to a different
function depending on whether it is accessed on a class or on an instance,
and returns a partial function with that class or instance already bound.
This merges the separate methods for classes and instances behind one name
and executes the correct one depending on the caller.

It goes like this -

import functools

class classorinstancemethod:
    def __init__(self, class_method, instance_method):
        self.class_method = class_method
        self.instance_method = instance_method

    def __get__(self, instance, owner):
        if instance is None:
            # accessed on the class: bind the class
            return functools.partial(self.class_method, owner)
        # accessed on an instance: bind the instance positionally
        return functools.partial(self.instance_method, instance)

Registration, unregistration and getting lookups can then be done through
the partial functions register_lookup, _unregister_lookup and get_lookups
respectively.

register_lookup = classorinstancemethod(register_class_lookup,
                                        register_instance_lookup)

_unregister_lookup = classorinstancemethod(_unregister_class_lookup,
                                           _unregister_instance_lookup)

get_lookups = classorinstancemethod(get_class_lookups, get_instance_lookups)

This would allow all callers to keep using the same method names without
breaking the current implementation. We would have separate methods for
class calls and instance calls, merged by the classorinstancemethod class.
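
The dispatch described in this section can be tried out in isolation. The following is a minimal, self-contained sketch: the Registry class and the strings its functions return are hypothetical and exist only to show which function runs. The descriptor binds the class or instance positionally with functools.partial, so further positional arguments can still be passed.

```python
import functools


class classorinstancemethod:
    """Descriptor that dispatches to a class-level or instance-level function."""

    def __init__(self, class_method, instance_method):
        self.class_method = class_method
        self.instance_method = instance_method

    def __get__(self, instance, owner):
        if instance is None:
            # Accessed on the class: bind the class as first argument.
            return functools.partial(self.class_method, owner)
        # Accessed on an instance: bind the instance as first argument.
        return functools.partial(self.instance_method, instance)


class Registry:
    def register_class_lookup(cls, name):
        return f"class-level: {cls.__name__}.{name}"

    def register_instance_lookup(self, name):
        return f"instance-level: {type(self).__name__} object, {name}"

    register_lookup = classorinstancemethod(
        register_class_lookup, register_instance_lookup
    )


print(Registry.register_lookup("le"))    # class-level: Registry.le
print(Registry().register_lookup("le"))  # instance-level: Registry object, le
```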

5.4 Storing lookups

For Classes

This is done by a class variable class_lookups which enables all child classes
to possess this data. class_lookups is a dictionary that maps lookup_names to
lookups.

For Instances

Instance lookups must not be stored in a class variable, else they won’t be
unique to the instance. The best way to store them would be a normal instance
attribute. This could be called instance_lookups, a dictionary that maps
lookup_names to lookups.
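
The difference between the two storage locations is easy to see in plain Python. In this sketch the class names SharedStore and PerInstanceStore are made up for the demonstration, and strings stand in for lookup classes.

```python
class SharedStore:
    lookups = {}  # class variable: one dict shared by every instance


class PerInstanceStore:
    def __init__(self):
        self.instance_lookups = {}  # instance attribute: one dict per object


a, b = SharedStore(), SharedStore()
a.lookups["nte"] = "NotEqual"
print("nte" in b.lookups)  # True: b sees a's registration

c, d = PerInstanceStore(), PerInstanceStore()
c.instance_lookups["nte"] = "NotEqual"
print("nte" in d.instance_lookups)  # False: d is unaffected
```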


5.5 get_lookups()
get_lookups() would need two separate underlying methods, one for getting
instance lookups and one for getting class lookups.

5.5.1 get_class_lookups()

This is the method currently used to get class lookups. It just needs to be
renamed. It fetches the lookups stored in the class variable class_lookups.

5.5.2 get_instance_lookups()

This would fetch all the lookups present in the instance. This would be -
● All the class_lookups of the class of the instance.
● Extra registered instance_lookups for that particular instance.

So it would look something like this -

def get_instance_lookups(self):
    # get class lookups
    class_lookups = self.get_class_lookups()
    # get instance lookups (empty if none have been registered yet)
    instance_lookups_only = getattr(self, 'instance_lookups', {})
    # merge dictionaries; instance lookups override class lookups
    instance_lookups = {**class_lookups, **instance_lookups_only}
    return instance_lookups

Thus all the lookups of the instance are returned, with instance lookups
taking precedence over class lookups of the same name.


5.6 get_lookup()
This method would return the lookup named lookup_name. It also needs
separate methods for classes and instances, merged behind get_lookup() by
the partial function that classorinstancemethod returns.

5.6.1 get_class_lookup()

This method returns the class lookup named lookup_name. It is the existing
method and just needs to be renamed.

5.6.2 get_instance_lookup()

This method would be similar to the above method but would be called by an
instance. Support for this could also prove useful (not done in
charettes@ccc58bf).
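
One way get_instance_lookup() could behave is sketched below: check the instance's own registrations first and fall back to the class. SketchField is a hypothetical stand-in for a Field class, and strings are used in place of real lookup classes; the actual implementation may differ.

```python
class SketchField:
    # Hypothetical stand-in for a Field class with class-level lookups.
    class_lookups = {"exact": "Exact", "lt": "LessThan"}

    @classmethod
    def get_class_lookup(cls, lookup_name):
        return cls.class_lookups.get(lookup_name)

    def get_instance_lookup(self, lookup_name):
        # Instance registrations take precedence over class ones.
        instance_lookups = getattr(self, "instance_lookups", {})
        if lookup_name in instance_lookups:
            return instance_lookups[lookup_name]
        return self.get_class_lookup(lookup_name)


age = SketchField()
age.instance_lookups = {"lt": "AbsoluteValueLessThan"}
other = SketchField()

print(age.get_instance_lookup("lt"))     # AbsoluteValueLessThan
print(other.get_instance_lookup("lt"))   # LessThan
print(age.get_instance_lookup("exact"))  # Exact
```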

5.7 register_lookup() and unregister_lookup()


These methods would register or unregister the lookup on the class or
instance (decided by classorinstancemethod). Each would therefore require
two separate methods, one for classes and one for instances, for
classorinstancemethod to choose from.

5.7.1 register_class_lookup() and unregister_class_lookup()

register_class_lookup is a class method which creates a new key-value pair
{lookup_name: lookup} in the class variable dictionary class_lookups.

unregister_class_lookup would delete the {lookup_name: lookup} item from
the class variable dictionary class_lookups.

5.7.2 register_instance_lookup() and unregister_instance_lookup()

register_instance_lookup would create a new key-value pair {lookup_name:
lookup} in the instance attribute dictionary instance_lookups.

Implementation could be like -

def register_instance_lookup(self, lookup, lookup_name=None):
    if lookup_name is None:
        lookup_name = lookup.lookup_name
    # create the per-instance dictionary on first use
    if not hasattr(self, 'instance_lookups'):
        self.instance_lookups = {}
    self.instance_lookups[lookup_name] = lookup

unregister_instance_lookup would delete the item {lookup_name: lookup} from
the instance attribute instance_lookups.
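
The instance-level pair described in this subsection can be sketched together as follows. SketchField and NotEqual are hypothetical classes used only for the demonstration, and the error behaviour (a KeyError for an unknown lookup) is an assumption rather than a decided design.

```python
class SketchField:
    """Hypothetical field-like class holding only per-instance lookups."""

    def register_instance_lookup(self, lookup, lookup_name=None):
        if lookup_name is None:
            lookup_name = lookup.lookup_name
        # Check __dict__ so a dict inherited from the class is never reused.
        if "instance_lookups" not in self.__dict__:
            self.instance_lookups = {}
        self.instance_lookups[lookup_name] = lookup
        return lookup

    def unregister_instance_lookup(self, lookup, lookup_name=None):
        if lookup_name is None:
            lookup_name = lookup.lookup_name
        instance_lookups = self.__dict__.get("instance_lookups", {})
        # Raises KeyError if the lookup was never registered on this instance.
        del instance_lookups[lookup_name]


class NotEqual:
    lookup_name = "nte"


name = SketchField()
name.register_instance_lookup(NotEqual)
print(sorted(name.instance_lookups))  # ['nte']
name.unregister_instance_lookup(NotEqual)
print(sorted(name.instance_lookups))  # []
```

Deleting from instance_lookups only undoes an instance-level registration. Hiding a class-level lookup such as contains for a single instance would need an additional mechanism, which this sketch does not attempt.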

5.8 Per-model custom lookups


In addition to adding support for per-field-instance custom lookups, adding
support for per-model custom lookups looks like a healthy addition. I feel it
would be moderately easy to implement and could be accomplished within the
project period.

The syntax would be as follows (same for unregistering as well) -

Model - Employee, Lookup - AbsoluteValueLessThan, Field - IntegerField

Employee.register_lookup(IntegerField, AbsoluteValueLessThan)


5.8.1 Per-model lookups - register_lookup overview

This would be a register_lookup method on the Model class, defined in
django/db/models/base.py.

register_lookup would take two arguments.

● Field - could be the Field class or one of its child classes (CharField,
IntegerField, etc).
● Lookup - the lookup that is to be registered.

A similar implementation could be done for unregistering model lookups as
well.

5.8.2 Model.register_lookup working

My idea goes like this -

● Model.register_lookup would loop through the Field instances of the
Model.
● It would execute register_lookup for each instance that matches the
Field given as an argument.

The same procedure could be followed for unregistering lookups for models
as well.

Mock implementation -

@classmethod
def register_lookup(cls, field, lookup):
    # cls._meta.get_fields() is used just for demonstration
    for model_field in cls._meta.get_fields():
        # only touch instances of the requested Field class
        if isinstance(model_field, field):
            model_field.register_lookup(lookup)
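
The per-model idea can be exercised without Django using toy classes. Everything in the sketch below (the minimal Field classes, SketchModel and its plain fields dictionary) is a hypothetical stand-in; only the loop that filters field instances by class reflects the approach described in this section.

```python
class Field:
    def __init__(self):
        self.instance_lookups = {}

    def register_lookup(self, lookup):
        self.instance_lookups[lookup.lookup_name] = lookup


class CharField(Field):
    pass


class IntegerField(Field):
    pass


class SketchModel:
    # Hypothetical stand-in for a Django model; fields kept in a plain dict.
    fields = {}

    @classmethod
    def register_lookup(cls, field_class, lookup):
        # Register only on the field instances of the requested type.
        for model_field in cls.fields.values():
            if isinstance(model_field, field_class):
                model_field.register_lookup(lookup)


class Employee(SketchModel):
    fields = {
        "name": CharField(),
        "department": CharField(),
        "age": IntegerField(),
    }


class AbsoluteValueLessThan:
    lookup_name = "lt"


Employee.register_lookup(IntegerField, AbsoluteValueLessThan)
print(sorted(Employee.fields["age"].instance_lookups))   # ['lt']
print(sorted(Employee.fields["name"].instance_lookups))  # []
```

Passing the base Field class instead of IntegerField would register the lookup on every field of the model, since isinstance also matches subclasses.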


5.9 Additional Features (optional)


If time permits -

● Adding support to register or unregister multiple (a list of) lookups at
once rather than doing it one by one.
● Adding support for per-field-instance transforms as well (would be very
similar).
● Any other idea that might pop up for me or my mentor during the
course of this project.


6. Timeline
Since the project is partially done in #11408 and charettes@ccc58bf, I
think it would likely be a moderately long project. The already implemented
methods would only need to be improved, optimized, further tested and
documented (which in my opinion would not take a lot of time). New
methods would take some time to implement from scratch.

Even though some work is already done, completing the project in 175
hours might not be possible if any problem arises along the way. To give the
best possible output, a 350-hour project would provide enough time to
overcome any such problems and would allow me to add additional features to
this project.

So considering this as a 350-hour project (as mentioned in StartingGSOC),
here’s the timeline -

6.1 Phase 1 - June 13 to July 25 (6 weeks)


Week 1 - Week 2 (June 13 - June 26)

I would be working at half speed due to my university exams, which run from
June 21 till June 30 (tentative).

● Add a classorinstancemethod class for differentiating instance calls
from class calls (returns partial functions).
● Check all possible edge cases of the above class.
● Add tests for the implementation.

Week 3 - Week 4 (June 27 - July 10)

I would be working at full speed because my examinations would be over and
I will have a month-long holiday (all of July). I could work 6-8 hours a
day.

● Implement register_instance_lookup to register lookups on field
instances.
● Discuss and implement how to handle possible errors that could occur.
● Discuss and document the method.
● Write tests for possible cases.

Week 5 - Week 6 (July 11 - July 24)

● Implement unregister_instance_lookup to allow unregistration of
unwanted lookups on field instances.
● Handle errors in different cases and write documentation.
● Write tests for all possible cases.

6.2 Phase 2 - July 25 to September 12(tentative)

GSOC this year would be flexible with time, so the second phase time period
would differ for different projects.

Week 7 (July 25 - July 31)

● Implement the get_lookup, get_lookups, register_lookup and
unregister_lookup partial functions.
● Write tests and document them.

Week 8 - Week 9 (August 1 - August 14)

● Implement get_instance_lookups, which gets all lookups for the
instance, and get_instance_lookup, which gets the lookup named
lookup_name.
● Write tests for both methods and document them.
● Figure out edge cases and handle errors.

Week 10 - Week 13 (August 15 - September 12)

The Model.register_lookup() method would take some time to implement. My
university classes would start at this point and I would be able to work 3-5
hours a day.

● Figure out and optimise my approach to implementing both methods.
● Create the register_lookup and unregister_lookup methods in the
Model class.
● Write tests and document both methods.
● Handle errors and edge cases.
● Complete any remaining work if the above is implemented on time.

If I finish by this time, I will submit the project within the standard
coding period. If that is not possible, I will complete the remaining work
during the extended coding period, which goes on till November 13. During
any remaining time, I could work on additional features if anything looks
promising.

6.3 Pre-GSOC and community bonding period


I would work on a few more tickets and get to know Django and its
various components. I could discuss with other Django contributors to
improve my approach to this project and my knowledge of the codebase. I
would try to interact with developers on the django-developers mailing list
as well as take feedback from my mentors.

7. Previous work
I am a beginner to open source, so I have not done a huge deal of
contributions before. But I have worked on two merged pull requests in
Django.

Pull Requests (merged)

● Fixed #33216 - Simplified deconstructed path for some expressions
● Refs #33216 - Made deconstructible to avoid changing path for
subclasses

So I am already comfortable with the contribution practices and coding style.

I have also made a few projects with Django (including a blog application)
and have built web applications using HTML, CSS, JavaScript and Node.js.

About Me
My name is Allen Jonathan David and I am a student at Vellore Institute of
Technology University in Vellore, India. My time zone is UTC+05:30. I have been
coding in Python for two years now and can also code in Java and JavaScript. I
have contributed a couple of pull requests to Django already and am
confident I can take on this task.

You can contact me via my email - [email protected]

Time zone - UTC+05:30
Email - [email protected]
Github - AllenJonathan
Phone - 91-8778994400
