Inventorying data: a new approach

ArcPy's list functions let you list a particular data type in a given workspace, but extending that to a whole directory tree meant cobbling those list functions together with Python’s os.walk and repeatedly resetting the workspace environment. It can be done (as shown here), but in my experience the process is easy to get wrong.

Python’s os.walk is noteworthy and useful because it handles the tree traversal for you, but it only sees files and folders. It can’t peer into a geodatabase to identify feature classes, for example.
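To make that limitation concrete, here is a minimal sketch of a file-only inventory built on os.walk. The function name find_files and its extensions parameter are hypothetical, chosen just for illustration. It finds shapefiles by extension well enough, but a file geodatabase shows up only as a folder of internal files, so the feature classes inside it are never reported.

```python
import os

def find_files(top, extensions):
    """Yield full paths of files under top whose extension matches."""
    # str.endswith accepts a tuple, so normalize the extensions once.
    wanted = tuple(ext.lower() for ext in extensions)
    for dirpath, dirnames, filenames in os.walk(top):
        for filename in filenames:
            # Match on the file name only; os.walk knows nothing about
            # what a file (or a .gdb folder) actually contains.
            if filename.lower().endswith(wanted):
                yield os.path.join(dirpath, filename)
```

Calling find_files(r"c:\data", [".shp"]) would yield each .shp file in the tree and nothing from inside a geodatabase.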

At 10.1 Service Pack 1, we added arcpy.da.Walk. Walk takes care of all that workspace handling for you and mimics os.walk in its arguments and behavior.
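One os.walk behavior worth knowing is top-down pruning: editing the list of directory names in place stops the walk from descending into the folders you remove. The sketch below shows this with os.walk only. The names walk_skipping and skip_names are hypothetical, and whether the same in-place pruning carries over to arcpy.da.Walk is an assumption to check against the Walk documentation.

```python
import os

def walk_skipping(top, skip_names):
    """Yield (dirpath, filenames) under top, skipping the named folders."""
    for dirpath, dirnames, filenames in os.walk(top, topdown=True):
        # Slice assignment edits the list os.walk is holding, so the
        # removed folders are never visited. Rebinding dirnames to a
        # new list would have no effect on the walk.
        dirnames[:] = [d for d in dirnames if d not in skip_names]
        yield dirpath, filenames
```

For example, walk_skipping(r"c:\data", {"scratch"}) would visit every folder under c:\data except those named scratch.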

The code below wraps arcpy.da.Walk in a generator function that yields the full path of every dataset of the requested datatype(s) under a given workspace.

import os
import arcpy

def inventory_data(workspace, datatypes):
    """
    Generates full path names under a catalog tree for all requested
    datatype(s).

    Parameters:
    workspace: string
        The top-level workspace that will be used.
    datatypes: string | list | tuple
        Keyword(s) representing the desired datatypes. A single
        datatype can be expressed as a string, otherwise use
        a list or tuple. See arcpy.da.Walk documentation 
        for a full list.
    """
    for path, path_names, data_names in arcpy.da.Walk(
            workspace, datatype=datatypes):
        for data_name in data_names:
            yield os.path.join(path, data_name)


# Example: print the full path of every feature class found under c:\data.
for feature_class in inventory_data(r"c:\data", "FeatureClass"):
    print(feature_class)
