How To Structure Field Data for Clean GIS Imports

Most utility data quality problems do not start in the office. They start the moment a field crew opens a blank form, types “PVC pipe” into a freetext field and moves on to the next asset. By the time that dataset reaches a GIS technician, it contains a dozen variations of the same value, attribute fields that do not match the target schema and at least one coordinate reference system mismatch that will cause the import to fail silently. The data looks complete but is not usable.

Why Utility Datasets Fail Before Anyone Opens the GIS

Field data fails in the GIS not because GIS technicians make mistakes during import, but because the data arrives in a format the GIS environment was never designed to accept. The failure is baked in at capture.

The typical chain looks like this: a survey crew completes a job and hands over a dataset. The GIS technician opens it and finds field names that do not match the geodatabase schema, material values recorded as freetext, several required attributes left blank and geometry that mixes point and line features in the same layer. None of this is fixable by running an import tool. All of it requires manual intervention before a single feature can enter the asset register.

That manual intervention takes time, sometimes hours per dataset, sometimes days across a programme of works. It delays handover to the asset owner. It introduces new errors when a technician has to interpret what a field crew member meant when they typed “old cast iron, maybe 6 inch” into a diameter field. And it repeats. The same crew, the same jobs, the same problems, month after month, because nothing about the capture process has changed.

Over years, this accumulation of inconsistent records makes the asset register itself unreliable. A GIS query for all asbestos cement mains in a pressure zone returns a partial result because the same material is recorded as “AC,” “asbestos cement,” “AC pipe,” “grey cement” and “unknown” across different jobs captured by different crews at different times. The data exists. It cannot be trusted.

What “Bad Data” Actually Looks Like

Bad field data is not usually the result of a crew being careless. It is the predictable result of asking people to capture structured information using an unstructured tool. When a field form has no constraints on what can be entered, variation is inevitable.

The following table shows the most common failure modes in utility field data, what they look like in practice and what they cost the GIS team downstream.

| Failure Mode | What It Looks Like in the Data | Downstream Consequence |
|---|---|---|
| Inconsistent field naming | “pipe_material” in one job, “MATERIAL” in another, “Mat” in a third | Schema mapping fails; fields cannot be merged across jobs |
| Freetext where coded values are required | “PVC,” “pvc pipe,” “PVC-U,” “plastic,” “grey PVC” all in the same dataset | GIS queries return incomplete or incorrect results |
| Missing required attributes | Diameter field blank on 30% of pipe records | Dataset cannot be imported into a schema with required fields |
| Coordinate reference system mismatch | Data captured in a local grid, imported into a national coordinate system | Features appear in the wrong location or off the map entirely |
| Mixed geometry types in one layer | Point features and line features in the same feature class | Layer cannot be styled, queried or analysed correctly |
| Date format inconsistency | “12/03/2024,” “March 2024,” “unknown” and “03-12-24” in the same date field | Date-based queries and asset age calculations fail |

Each of these problems has a structural cause. None of them require the crew to make a technical error. They require the crew to be given no structure at all.
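The date-format failure mode is easy to demonstrate. A strict parser accepts exactly one format, so every variant a crew invents becomes an unparseable record. A minimal illustration, using the example values from the table above:

```python
from datetime import datetime

# The date strings below are the example variants from the table above.
raw_dates = ["12/03/2024", "March 2024", "unknown", "03-12-24"]

parsed, failed = [], []
for value in raw_dates:
    try:
        # Accept only the one agreed format: day/month/year.
        parsed.append(datetime.strptime(value, "%d/%m/%Y"))
    except ValueError:
        failed.append(value)

print(len(parsed), "parsed;", failed)  # 1 parsed; ['March 2024', 'unknown', '03-12-24']
```

Three of four records fail, and the "unknown" entry fails silently into the same bucket as the merely misformatted ones, so an age calculation built on this field cannot distinguish bad formatting from genuinely missing data.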

The Problem with Freetext Fields

Freetext entry in attribute fields is the most common single source of data quality failure in utility surveys. When a crew member can type anything into a field, they will — and no two people will type the same thing the same way.

Consider a pipe material field on a 500-record water main dataset. Across 12 months of work, captured by three different crews, the same field contains: “PVC,” “pvc,” “PVC pipe,” “PVC-U,” “uPVC,” “plastic,” “grey plastic,” “poly,” “polyethylene” and “unknown.” Every one of those entries refers to a physical pipe material. None of them match the coded value that the GIS schema expects, which is “PVC-U.”

When the GIS technician runs a query for all PVC-U mains to plan a replacement programme, the result is wrong. Not slightly wrong but wrong by whatever proportion of the dataset used a variation that the query did not match. The asset owner makes a capital planning decision on incomplete information, and nobody knows it.

| What the Crew Typed | What the Schema Expects |
|---|---|
| PVC pipe | PVC-U |
| pvc | PVC-U |
| grey plastic | PVC-U |
| poly | PE (polyethylene) |
| unknown | (required — cannot be blank) |

The fix is not to train crews to spell material codes correctly. The fix is to remove the freetext field and replace it with a dropdown list of approved values. The crew selects from the list so that variation is impossible.
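For data that has already been captured as freetext, the cleanup has to be done once, explicitly, before import. The sketch below is a hypothetical one-off normalisation script, not Geolantis functionality; the variant list and coded values are invented for the example, following the table above.

```python
# One-off cleanup sketch: map each observed freetext variant to its coded value.
# Every mapping is explicit; anything unmapped is flagged for manual review
# rather than guessed.
MATERIAL_CODES = {
    "pvc": "PVC-U",
    "pvc pipe": "PVC-U",
    "pvc-u": "PVC-U",
    "upvc": "PVC-U",
    "grey plastic": "PVC-U",
    "poly": "PE",
    "polyethylene": "PE",
}

def normalise_material(raw):
    """Return the coded value, or None so the record goes to manual review."""
    return MATERIAL_CODES.get(raw.strip().lower())

print(normalise_material("PVC pipe"))  # PVC-U
print(normalise_material("unknown"))   # None
```

Note that "unknown" deliberately maps to nothing: a human has to decide what that record means, which is exactly the interpretive work a dropdown would have prevented.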

What GIS Teams Actually Need From Field Data

A GIS technician needs one thing from a field dataset: the ability to import it without touching it first. That requires the field data to arrive in a specific condition, and most of it does not.

The communication gap between field crews and GIS teams is the underlying problem. Field crews do not know what the GIS environment expects because no one has given them that information in a form they can act on. GIS technicians know exactly what they need but have no mechanism to enforce it at the point of capture. The result is a perpetual cycle of bad data in, rework out.

Most generic mobile data collection tools (Survey123, Fulcrum and Kobo among them) are designed around flexibility. A form can be built quickly, fields can be adjusted between jobs and the tool will accept whatever the crew enters. That flexibility is useful for rapid deployment, but it produces a predictable outcome: data that varies in structure across crews, across jobs and across time. Fields get added informally. Attribute values drift. Relationships between assets go unrecorded. The GIS team receives a dataset that is technically complete and practically unusable.

Geolantis takes a different approach. Data collection in Geolantis is not a form exercise; it is an extension of a governed asset data model. Each asset type has a controlled schema defined before the job begins. Relationships between assets are explicitly modelled and enforced in the platform, so a valve captured on site is linked to its parent pipe, not recorded as a standalone point that a technician has to connect manually later. Validation is built into the capture process: required attributes cannot be skipped, categorical fields accept only approved coded values and the coordinate reference system is fixed at the project level. There are no ad hoc fields added in the field, no freetext where a controlled list should apply and no optional structure that becomes inconsistent across a programme.

A data schema is the foundation of this approach. In plain terms, a data schema is an agreed list of fields, their names, the type of data each field holds (a number, a date, a text value from a controlled list) and which fields are required. When a field crew captures data against a schema, and the GIS environment is built to the same schema, the import works without intervention. Geolantis makes that schema the capture tool itself, not a document the crew is expected to follow separately.
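To make the idea concrete, a schema of this kind can be expressed as plain data: field name, data type, required flag and an optional controlled value list. The sketch below is a hypothetical illustration, not the Geolantis configuration format; the field names and codes are invented for the example.

```python
# Hypothetical schema: each field declares its type, whether it is required,
# and (for categorical fields) the approved coded value list.
PIPE_SCHEMA = {
    "material":         {"type": str, "required": True,  "codes": ["PVC-U", "PE", "DI", "AC"]},
    "nominal_diameter": {"type": int, "required": True,  "codes": None},
    "install_date":     {"type": str, "required": False, "codes": None},
}

def validate(record, schema):
    """Return a list of violations; an empty list means the record conforms."""
    errors = []
    for name, rule in schema.items():
        value = record.get(name)
        if value is None:
            if rule["required"]:
                errors.append(f"missing required field: {name}")
            continue
        if not isinstance(value, rule["type"]):
            errors.append(f"wrong type for {name}")
        elif rule["codes"] and value not in rule["codes"]:
            errors.append(f"value not in coded list for {name}")
    return errors

print(validate({"material": "grey plastic", "nominal_diameter": 150}, PIPE_SCHEMA))
# ['value not in coded list for material']
```

The point of the sketch is that once the schema is data, the same definition can drive both the capture form and the import check, which is what keeps the two sides in agreement.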

A clean field data package, ready for GIS import, includes all of the following:

  • Field names that match the target database schema exactly, including capitalisation and underscore conventions
  • Attribute values drawn from the approved coded value list for each field, with no freetext variation
  • No blank required fields: every mandatory attribute populated for every feature
  • A defined coordinate reference system, consistent across all features in the dataset
  • Geometry types that match the target layer (points for point assets, lines for pipe runs), with no mixing
  • A single feature type per layer: pipes in one layer, valves in another, meters in a third
  • A survey date and accuracy value recorded against each feature

When field data arrives meeting all of these conditions, the import is a single step. When it does not, the import is a project.
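Several of these conditions can be checked mechanically before anyone opens the GIS. A pre-import gate might verify, for instance, that the dataset carries one declared CRS and one geometry type; the sketch below assumes a simplified collection layout and example values, and is an illustration rather than any platform's actual check.

```python
# Pre-import gate sketch: one CRS per dataset, one geometry type per layer.
# The expected values and the collection layout are assumptions for the example.
def check_dataset(collection, expected_geometry="Point", expected_crs="EPSG:27700"):
    problems = []
    if collection.get("crs") != expected_crs:
        problems.append(f"dataset CRS is {collection.get('crs')}, expected {expected_crs}")
    for i, feat in enumerate(collection["features"]):
        gtype = feat["geometry"]["type"]
        if gtype != expected_geometry:
            problems.append(f"feature {i}: geometry is {gtype}, expected {expected_geometry}")
    return problems

mixed = {
    "crs": "EPSG:27700",
    "features": [
        {"geometry": {"type": "Point"}},
        {"geometry": {"type": "LineString"}},  # a pipe run captured into a point layer
    ],
}
print(check_dataset(mixed))  # ['feature 1: geometry is LineString, expected Point']
```

A gate like this turns "the import is a project" into a pass/fail answer delivered minutes after sync, while the crew is still on site to fix it.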

Layers, Attributes and Why They Need to Stay Separate

A layer is a collection of the same type of spatial feature, where every feature in the collection shares the same attribute structure. All water mains go in the pipes layer. All isolation valves go in the valves layer. All fire hydrants go in the hydrants layer. The rule is simple, and it breaks constantly.

When a crew captures a valve at the same time as a pipe run, using the same form, both features end up in the same layer if the capture tool does not enforce separation. The layer now contains two geometry types (points and lines) with conflicting attribute structures. The GIS technician cannot import it into a typed feature class without manually splitting it first.

Attributes are the fields that describe each feature: pipe material, nominal diameter, depth to crown, installation date, pressure zone, condition grade. The attributes required for a pipe are different from the attributes required for a valve, which are different again from those required for a hydrant. Each feature type needs its own form, its own layer and its own attribute set.

| Feature Type | Layer Name | Core Attributes |
|---|---|---|
| Water main | pipes_water | material, nominal_diameter, depth_crown, pressure_zone, install_date, condition_grade |
| Isolation valve | valves_isolation | valve_type, size, operating_status, depth_top, material |
| Fire hydrant | hydrants_fire | hydrant_type, size, operating_pressure, cover_type, install_date |
| Water meter | meters_water | meter_size, meter_type, connection_diameter, service_address |
| Air valve | valves_air | valve_type, size, depth, install_date |

When every feature type has a defined layer and a defined attribute set, and those definitions are built into the capture tool, the crew cannot mix geometry types or miss required fields. The structure enforces itself.
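The enforcement itself is simple to sketch: route every captured feature into a bucket keyed by its feature type, so a mixed layer is impossible by construction. The layer names follow the table above; the code is illustrative, with a hypothetical record layout.

```python
from collections import defaultdict

# Feature type -> target layer, following the table above (subset shown).
LAYER_FOR_TYPE = {
    "water_main": "pipes_water",
    "isolation_valve": "valves_isolation",
    "fire_hydrant": "hydrants_fire",
}

def route_features(features):
    """Group captured features into one list per target layer."""
    layers = defaultdict(list)
    for feat in features:
        # An unrecognised feature type raises immediately rather than
        # landing silently in the wrong layer.
        layer = LAYER_FOR_TYPE[feat["feature_type"]]
        layers[layer].append(feat)
    return dict(layers)

captured = [
    {"feature_type": "water_main", "material": "PVC-U"},
    {"feature_type": "isolation_valve", "valve_type": "gate"},
]
print(sorted(route_features(captured)))  # ['pipes_water', 'valves_isolation']
```

When the capture tool makes this routing decision at the moment of capture, the GIS technician never has to split a mixed layer after the fact.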

The Real Cost of Rework

Data rework is not a minor inconvenience in utility programmes. It is a recurring operational cost that scales with every job that produces unstructured data.

A single mid-size water main survey (200 to 300 features) with the typical range of freetext values, inconsistent field naming and missing attributes takes an experienced GIS technician between three and six hours to clean before it can be imported. That is not a one-time cost. Across a programme of 20 surveys per year, that is between 60 and 120 hours of skilled technical time spent on work that adds no new information to the asset register. It reformats information that already exists.

The secondary costs compound this. Delayed handover means the asset owner cannot update their records on the programme schedule. Incomplete records that do enter the asset register carry errors that are expensive to identify and correct later. And when a dataset is returned to the contractor for rework, the crew that captured it has often moved to the next job, meaning that context is lost, decisions are guessed and the corrected data is less reliable than data that was right the first time.

The comparison is direct: configuring a field form to enforce correct data structure takes a few hours once. Cleaning the output of an unconfigured form takes hours every time a job is submitted. The cost of not fixing the problem at the source accumulates for as long as the programme runs.

How Correct Data Structure Gets Built Into the Field Workflow

Data quality at the field level is not a training problem. It is a tool design problem. A crew given an unstructured form will produce unstructured data. A crew given a form built to the target GIS schema will produce data that imports cleanly, regardless of their GIS knowledge.

Geolantis builds the data schema into the capture form. When a project is configured in Geolantis, the data manager or GIS technician defines the feature types required for the job, the attributes required for each type, the coded value lists for each attribute field and which fields are mandatory. That configuration becomes the form the crew uses on site.

The crew does not see a schema document. They see a form with labelled fields, dropdown lists and required field indicators. They select a feature type, fill in the fields and move to the next asset. The form will not save a record with a required field blank. The material field offers a dropdown list, not a text box. The coordinate reference system is set at the project level and applies to every captured point automatically.

The following table shows the difference between data captured with an unstructured tool and data captured through a Geolantis-configured form:

| Data Condition | Unstructured Capture | Geolantis-Structured Capture |
|---|---|---|
| Field naming | Varies by crew and job | Fixed to the target schema at project setup |
| Material values | Freetext — unlimited variation | Coded value dropdown — only approved values selectable |
| Required attributes | Optional in practice | Enforced — form will not save without them |
| Coordinate reference system | Set by device default or crew preference | Defined at project level, consistent across all features |
| Geometry type separation | Mixed if crew uses one form for all features | Enforced by feature type selection |
| Import readiness | Requires cleaning before import | Imports directly into the target GIS schema |

The configuration is done once per project type, not once per job. A standard water main survey form, built to the asset owner’s schema, deploys to every job of that type across the programme. The crew captures the same structure every time, on every job, regardless of which crew member is in the field that day.

Offline Capture and Why It Does Not Change the Structure

A common assumption about offline field capture is that it produces less reliable data than capture with a live connection: that constraints do not apply, that required fields can be skipped, or that the form behaves differently away from the network. That assumption is incorrect.

Geolantis maintains the full data schema and form configuration in offline mode. The crew captures the same fields, against the same coded value lists, with the same required field enforcement, whether the tablet has a 5G connection or no signal at all. The schema is stored on the device at job download, not retrieved from the network during capture.

When the tablet reconnects, the data syncs to the platform automatically. No manual upload step is required. No data captured offline is treated differently from data captured online. The import-ready structure of the data is not a function of connectivity; it is a function of how the form is configured.

What Happens Between the Field and the GIS

Data captured in Geolantis follows a defined path from the field tablet to the GIS environment: capture on site, sync to the platform, review by the project team, export in the target format and import into the GIS. Each step is designed to remove manual intervention, not introduce it.

Geolantis supports two integration paths, and both preserve the structured data the form produces: a direct ArcGIS integration, and standard-format export for QGIS and other environments.

For organisations using QGIS or other open-source GIS environments, Geolantis can export data to standard formats such as GeoJSON and CSV. Because data capture is governed by a predefined schema with controlled field names, attribute structures, and geometry types, the resulting datasets are typically well structured and import cleanly into most GIS platforms that support these formats, often requiring only minimal transformation.
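As an illustration of what "import cleanly" means in practice, a GeoJSON export produced against a governed schema can be inspected with nothing but the standard library before it is loaded into QGIS. The property names below are assumptions for the example, not a real Geolantis export.

```python
import json

# A minimal GeoJSON FeatureCollection, hand-written here to stand in for
# a schema-governed export; property names are invented for the example.
geojson_text = """{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [13.4, 47.2]},
     "properties": {"material": "PVC-U", "nominal_diameter": 150}}
  ]
}"""

collection = json.loads(geojson_text)
for feat in collection["features"]:
    props = feat["properties"]
    print(feat["geometry"]["type"], props["material"], props["nominal_diameter"])
# Point PVC-U 150
```

Because every feature carries the same property names and coded values, a GIS that reads GeoJSON can map the whole collection to its target layer in one pass, with no per-record interpretation.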

The integration path is a deployment decision, not a data quality variable. Regardless of which GIS environment receives the data, the structure of what it receives is defined before the crew sets foot on site.

The data path from field to GIS looks like this:

Field capture (Geolantis, on-device) → Sync to platform (automatic on connectivity) → Project team review (flag, annotate or approve records) → Export in target format (direct ArcGIS integration or standard format for QGIS and others) → GIS import (no reformatting required)

Every step in that path is agreed in advance. The GIS technician is not waiting for data to arrive and hoping it is usable; the structure of what they will receive is already defined.

Building a Data Standard That Crews Can Actually Follow

The single most important factor in sustainable field data quality is not how skilled the field crew is. It is how well-defined the data standard is that they are given to work with. A vague standard produces variable data. A specific, enforced standard produces consistent data, regardless of crew experience or GIS knowledge.

A practical field data standard defines the following for every project type:

  • Every feature type to be captured and the layer it belongs to
  • Every attribute required for each feature type, with the field name exactly as it appears in the target GIS schema
  • The data type for each attribute (text, integer, decimal, date)
  • The coded value list for every categorical attribute field
  • Which attributes are mandatory and which are optional
  • The coordinate reference system to be used on every job
  • The minimum positional accuracy required for each feature type
  • The photo and field note requirements for each feature type

When this standard is defined and built into a Geolantis project configuration, the crew does not need to carry a reference document or remember which values are acceptable for which fields. The form carries the standard. The crew follows the form.

The configuration is reusable. A water main survey standard, once defined and tested in Geolantis, deploys to every water main survey job across the programme. When the asset owner updates their schema requirements, the configuration updates once and the change applies to every subsequent job automatically. The standard stays current without requiring a retraining programme every time something changes.

Geolantis builds your data schema into the capture form, so field crews collect the right data in the right format on every job. See how the full workflow runs from field capture to GIS import.

Request A Live Demo Today

Ready to see how Geolantis can elevate your utility mapping? Fill out the form to schedule a personalized demo.

Our team will walk you through the features and benefits tailored to your needs, helping you unlock the full potential of Geolantis.360.
