Adding Fields: Performance Tips

Two approaches can improve performance when adding many fields to a table or feature class.

1. Always load or create the table or feature class in-memory:

import arcpy
# Add fields to a feature class loaded into memory.
fc = r'c:\data\boston.gdb\parcels'
arcpy.management.MakeFeatureLayer(fc, 'parcels_lyr')
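# out_fields is assumed to be defined elsewhere: an iterable of objects
# with name, type, length and aliasName attributes.
for f in out_fields:
    arcpy.AddField_management('parcels_lyr',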
                              f.name,
                              field_type=f.type,
                              field_length=f.length,
                              field_alias=f.aliasName)

 

import os
import arcpy

# Add the fields to an in_memory table.
tmp_table = os.path.join('in_memory', 'table_template')
arcpy.management.CreateTable(*os.path.split(tmp_table))
for f in out_fields:
    arcpy.AddField_management(tmp_table,
                              f.name,
                              field_type=f.type,
                              field_length=f.length,
                              field_alias=f.aliasName)

# Create the actual output table.
arcpy.CreateTable_management(out_path,
                             out_table_name,
                             template=tmp_table)
arcpy.Delete_management(tmp_table)
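
Both snippets above loop over out_fields, which the post never defines. The following is a minimal sketch of one way it could be supplied, assuming a hypothetical FieldSpec namedtuple that carries the four attributes the loops read. The field names and values are purely illustrative, and the type strings are assumed to be Add Field keywords such as TEXT, LONG and DOUBLE.

```python
from collections import namedtuple

# Hypothetical stand-in for the field objects the snippets above iterate over.
FieldSpec = namedtuple('FieldSpec', ['name', 'type', 'length', 'aliasName'])

# Illustrative values only; these field names are not from the original post.
out_fields = [
    FieldSpec('OWNER', 'TEXT', 50, 'Owner Name'),
    FieldSpec('ASSESSED', 'DOUBLE', None, 'Assessed Value'),
    FieldSpec('ZONE_ID', 'LONG', None, 'Zone ID'),
]


def add_field_kwargs(f):
    """Return the keyword arguments the loops above pass to Add Field."""
    return {'field_type': f.type,
            'field_length': f.length,
            'field_alias': f.aliasName}


# With arcpy available, each loop body above is then equivalent to:
#     arcpy.AddField_management(tmp_table, f.name, **add_field_kwargs(f))
for f in out_fields:
    print(f.name, add_field_kwargs(f))
```

Keeping the field definitions in one plain list means the same out_fields can drive either snippet without touching the geodatabase until the fields are actually added.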

2. Use the data access and NumPy modules. The data access module's ExtendTable() function joins the contents of a NumPy structured array to a table based on a common attribute field.
This is the faster approach; however, the types of fields you can add using NumPy are limited. Blob, raster, and date fields are not supported, and the field alias cannot be defined or altered.

import arcpy
import numpy

fc = r"c:\data\water.gdb\wells"

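# An empty structured array: only its dtype matters, since it defines the new fields.
# numpy.int and numpy.float were removed in NumPy 1.24, so sized types are used here.
narray = numpy.array([],
                     numpy.dtype([('_ID', numpy.int32),
                                  ('WELL_ID', numpy.int32),
                                  ('DESC', '|S100'),
                                  ('DEPTH', numpy.float64),
                                  ]))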

arcpy.da.ExtendTable(fc, "OID@", narray, "_ID")
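
Because blob, raster, and date fields cannot be added this way, it can help to split a field list into those that go through NumPy and those that still need Add Field. The sketch below is one possible way to do that in plain Python. The function name build_dtype_spec, the sample field list, and the mapping from Add Field type keywords to NumPy dtype strings are all assumptions made for illustration, not something taken from the arcpy documentation.

```python
# Field types that, per the text above, cannot be added through NumPy.
UNSUPPORTED_TYPES = {'BLOB', 'RASTER', 'DATE'}

# Assumed mapping from Add Field type keywords to NumPy dtype strings.
DTYPE_BY_FIELD_TYPE = {
    'SHORT': '<i2',
    'LONG': '<i4',
    'FLOAT': '<f4',
    'DOUBLE': '<f8',
}


def build_dtype_spec(fields, key_name='_ID'):
    """Split (name, type, length) tuples into a dtype spec and skipped names.

    The spec starts with the integer key column that is matched against
    the table's ObjectID field in the ExtendTable() call.
    """
    spec = [(key_name, '<i4')]
    skipped = []
    for name, field_type, length in fields:
        if field_type == 'TEXT':
            spec.append((name, '|S{}'.format(length)))
        elif field_type in DTYPE_BY_FIELD_TYPE:
            spec.append((name, DTYPE_BY_FIELD_TYPE[field_type]))
        else:
            # Unsupported (or unknown) types must be added with Add Field.
            skipped.append(name)
    return spec, skipped


# Hypothetical field list; DRILLED is a date field and is therefore skipped.
fields = [('WELL_ID', 'LONG', None),
          ('DESC', 'TEXT', 100),
          ('DEPTH', 'DOUBLE', None),
          ('DRILLED', 'DATE', None)]
spec, skipped = build_dtype_spec(fields)
print(spec)
print(skipped)

# With numpy and arcpy available, the spec would be used as in the snippet above:
#     narray = numpy.array([], numpy.dtype(spec))
#     arcpy.da.ExtendTable(fc, "OID@", narray, "_ID")
```

The names returned in skipped can then be added with the Add Field loop from the first approach, so the two methods can be combined.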

9 thoughts on “Adding Fields: Performance Tips”

    • Hi Stefan,

      This is a feature.

      Tools can be accessed directly from arcpy:
      arcpy.MakeFeatureLayer_management(…)

      Or, tools can be accessed from arcpy ‘toolbox’ modules. Each Geoprocessing toolbox has a corresponding module in arcpy.
      So you can access tools as:
      arcpy.management.MakeFeatureLayer(…)
      arcpy.analysis.Buffer(…)

      It’s a matter of preference. Functionally, they are no different. I prefer the latter; other team members prefer the first way.

      Hope this helps.

  1. Hello. Thank you for the tip. What about deleting a field or changing a field’s data type? I have to rename the old class, create a new class, load the data from the old class, and drop the old class. The to-do list is much longer once privileges, class extensions, domains, relationships, etc. are taken into account.

  2. I did some performance testing of your second method at http://gis.stackexchange.com/questions/99792/creating-numpy-array-with-variable-number-of-fields-to-test-arcpy-da-extendtable and found that the benefit of using the Data Access and NumPy modules, rather than multiple Add Field calls, is greatest when there are lots of fields and fewer features. If you have only a few fields to add and lots of features, then a few Add Field calls are quicker.
