Caching records

By default, Pyrtable does not use any caching mechanism. In other words, any query operation will hit the Airtable server to fetch fresh data. In the example below the server will be queried twice:

from pyrtable.record import BaseRecord
from pyrtable.fields import StringField

class EmployeeRecord(BaseRecord):
    class Meta:
        # Meta data
        ...

    name = StringField('Name')

if __name__ == '__main__':
    # At this point there is no communication with the server --
    # query is being built but not iterated over:
    employees_query = EmployeeRecord.objects.all()

    # Now data will be fetched from the server:
    for employee in employees_query:
        print(employee.name)

    # Since no caching mechanism is active,
    # data will be fetched *again* from the server:
    for employee in employees_query:
        print(employee.name)

This can be reduced to a single server hit by storing the query results beforehand:

from pyrtable.record import BaseRecord
from pyrtable.fields import StringField

class EmployeeRecord(BaseRecord):
    class Meta:
        # Meta data
        ...

    name = StringField('Name')

if __name__ == '__main__':
    # Notice that the query results (not the query itself)
    # are now being stored, so all server communication happens here
    employees = list(EmployeeRecord.objects.all())

    # This happens without server communication
    for employee in employees:
        print(employee.name)

    # This also happens without server communication
    for employee in employees:
        print(employee.name)

In either case, however, operations can be extremely slow when they involve linked records, i.e., those contained in SingleRecordLinkField and MultipleRecordLinkField fields. For these fields, the default strategy is to issue a new server request for each linked record. If you are working on a table with several linked records, this quickly becomes a waste of resources, especially if the same records are linked many times!
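
As a rough sketch of that cost (using the EmployeeRecord and ProjectRecord classes defined in the example further below, with no caching context active), the request counts in the comments are only indicative:

if __name__ == '__main__':
    # Without any caching context, iterating over the linked records of
    # every project may trigger one extra request per linked employee:
    for project in ProjectRecord.objects.all():   # paginated requests for the projects
        for employee in project.team:             # roughly one request per not-yet-seen employee
            print(employee.name)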

To overcome this, Pyrtable offers a caching strategy for linked records. Instead of loading them one by one, you can first fetch all records from the linked table(s), then work normally over the “linking” table. The following example illustrates this strategy:

from pyrtable.record import BaseRecord
from pyrtable.fields import StringField, MultipleRecordLinkField

# This table contains the records that will be referred to
# in another table
class EmployeeRecord(BaseRecord):
    class Meta:
        # Meta data
        ...

    name = StringField('Name')

# This table has a field that links to the first one
class ProjectRecord(BaseRecord):
    class Meta:
        # Meta data
        ...

    name = StringField('Name')
    team = MultipleRecordLinkField('Team Members',
                                   linked_class='EmployeeRecord')

if __name__ == '__main__':
    from pyrtable.context import set_default_context, SimpleCachingContext

    # Set up the caching mechanism
    caching_context = SimpleCachingContext()
    set_default_context(caching_context)

    # Fetch and cache all records from the Employee table
    caching_context.pre_cache(EmployeeRecord)

    # From now on, references to the .team field
    # will not require server requests
    for project in ProjectRecord.objects.all():
        print(project.name,
              ', '.join(employee.name for employee in project.team))

When will caching happen?

Besides the records loaded through caching_context.pre_cache(RecordClass), this mechanism also caches any record that is fetched from the server. So, after calling set_default_context(SimpleCachingContext()), any linked record will be fetched only once.
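
As a minimal sketch (reusing the record classes from the example above), even without any pre_cache() call the context still avoids repeated fetches of the same linked record:

from pyrtable.context import set_default_context, SimpleCachingContext

set_default_context(SimpleCachingContext())

# No pre_cache() here: the first project that links to a given employee
# still triggers one request for that employee, but later references to
# the same employee are answered from the cache.
for project in ProjectRecord.objects.all():
    print(project.name,
          ', '.join(employee.name for employee in project.team))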

Note

If you read the source code, you will notice that calling caching_context.pre_cache(EmployeeRecord) is the same as simply fetching all of the table's records (as they will be cached). In other words, this call is equivalent to list(EmployeeRecord.objects.all()).

Controlling which tables are cached

Caching all tables may be too much depending on your scenario. This default behaviour can be tuned using constructor arguments for the SimpleCachingContext class:

class SimpleCachingContext(allow_classes=None, exclude_classes=None)

allow_classes, if specified, is a list of record classes that will always be cached. Any classes not listed will not be cached.

exclude_classes, on the other hand, is a list of record classes that will never be cached. Any classes not listed will be cached.
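
A brief sketch of how these arguments might be used (EmployeeRecord and ProjectRecord are the classes from the examples above; LogRecord is a hypothetical, frequently-changing table used only for illustration):

from pyrtable.context import set_default_context, SimpleCachingContext

# Cache only employees and projects; all other tables always hit the server
caching_context = SimpleCachingContext(
    allow_classes=[EmployeeRecord, ProjectRecord])

# Alternatively, cache everything except the hypothetical LogRecord table:
# caching_context = SimpleCachingContext(exclude_classes=[LogRecord])

set_default_context(caching_context)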

The CachingContext.pre_cache() method

This method can receive several arguments; each argument specifies what is to be cached (see the sketch after this list):

  • If the argument is a subclass of BaseRecord, then all records will be fetched (by calling .objects.all()) and cached.
  • If the argument is a query (e.g., MyTableRecord.objects.filter(…)), then the records will be fetched and cached.
  • If the argument is a single record object (with a non-null .id), then this record will be stored in the cache without being fetched from the server.
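
Putting the three cases together, a single call might look like the sketch below; the keyword-style filter and the some_employee variable are assumptions used only for illustration:

from pyrtable.context import set_default_context, SimpleCachingContext

caching_context = SimpleCachingContext()
set_default_context(caching_context)

# A record obtained earlier (it already has a non-null .id)
some_employee = next(iter(EmployeeRecord.objects.all()))

caching_context.pre_cache(
    # A BaseRecord subclass: all of its records are fetched and cached
    EmployeeRecord,
    # A query: its results are fetched and cached
    ProjectRecord.objects.filter(name='Some project'),
    # A single record object: stored in the cache without a server request
    some_employee,
)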