Software

Thought Experiments for Understanding the Legality of Machine Learning in Structural Engineering

By M.Z. Naser, PhD, PE
March 1, 2026


Machine learning does not alter the fundamental obligations of structural engineers but rather changes how professional judgment can be documented and made legible to reviewers. This distinction matters: rather than treating machine learning as a special technology requiring new regulations, we can understand it as a documentation and governance challenge within existing professional frameworks.

Three controlled thought experiments are described here to isolate specific aspects of machine learning integration. Each experiment examines two scenarios that differ in one dimension of professional practice while maintaining identical physical designs, safety factors, code provisions, and commissioning protocols. This controlled comparison methodology enables precise identification of how documentation, disclosure, and governance choices affect legal responsibility independent of technical engineering competence.

Thought Experiment 1: Standard of Care and Professional Services vs Product Liability

This experiment examines how the type of prepared documentation determines whether machine learning-assisted engineering work falls under professional negligence standards or product liability doctrines. Following are two scenarios involving identical steel connection designs for a mid-rise building where machine learning algorithms were used to predict the connection-level demand envelopes across standard load cases.

Scenario A: Service Posture

In the first scenario, the project documentation comprehensively demonstrates that machine learning algorithms function as analytical instruments under direct engineering control. For example, the calculation package begins with a clear statement of methodology that identifies the machine learning algorithm as a preliminary sizing tool. The engineer of record provides complete hand calculations showing agreement within acceptable tolerances for connection moments and shears that can be verified through closed-form solutions. For more complex interaction effects where closed-form solutions are impractical, the engineer establishes conservative upper and lower bounds using simplified methods explicitly referenced to relevant code provisions.

The provided documentation also meticulously traces the reasoning for every engineering override of machine learning recommendations. For instance, the file identifies regions where training data exhibits lower density, particularly for unusual live-load patterns. For these cases, the engineer applies additional safety factors and documents the rationale with quantitative justification. Further, the machine learning model’s applicability is bounded by steel grades between A36 and A992, connection configurations limited to standard AISC prequalified details, and lateral systems consisting of either moment or braced frames, but not dual systems. The calculation package demonstrates that any competent structural engineer could reproduce the entire design process from first principles without access to the machine learning algorithm, using only the documented assumptions, code references, and engineering rationale provided.
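
To make this service posture concrete, the sketch below shows one way such verification records might be captured programmatically. It is a minimal Python illustration, not part of the article's scenarios: the grade set, detail labels, tolerance value, and all function and field names are hypothetical placeholders for project-specific values.

```python
from dataclasses import dataclass

# Hypothetical applicability bounds mirroring the documented validity limits.
VALID_GRADES = {"A36", "A572 Gr. 50", "A992"}   # grades within the documented range
VALID_DETAILS = {"RBS", "BUEP", "WUF-W"}        # AISC prequalified details (illustrative)
TOLERANCE = 0.10                                 # assumed acceptable relative deviation

@dataclass
class ConnectionCheck:
    label: str                   # connection identifier on the drawings
    grade: str
    detail: str
    ml_moment_kip_ft: float      # demand envelope predicted by the ML tool
    hand_moment_kip_ft: float    # independent closed-form estimate

def verify(check: ConnectionCheck) -> list[str]:
    """Return documentation flags for the calculation package."""
    notes = []
    if check.grade not in VALID_GRADES:
        notes.append("OUT OF DOMAIN: steel grade outside documented validity limits")
    if check.detail not in VALID_DETAILS:
        notes.append("OUT OF DOMAIN: detail is not a prequalified configuration")
    deviation = abs(check.ml_moment_kip_ft - check.hand_moment_kip_ft) / check.hand_moment_kip_ft
    if deviation > TOLERANCE:
        notes.append(f"OVERRIDE REQUIRED: ML estimate deviates {deviation:.0%} from hand calculation")
    return notes or ["VERIFIED: within tolerance and documented domain"]

# Example: one connection from the calculation package.
print(verify(ConnectionCheck("B2-C3", "A992", "RBS", 182.0, 175.0)))
# -> ['VERIFIED: within tolerance and documented domain']
```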

Scenario B: Product Posture

The second scenario presents identical connection designs but with fundamentally different documentation. The calculation package shows that connection sizes and configurations were obtained from a vendor’s machine learning web portal that accepts building geometry and loading conditions as inputs and produces connection schedules as outputs. The engineer’s role appears limited to transcribing these outputs into construction documents with minimal independent verification.

In this scenario, the documentation includes screenshots from the vendor portal showing input parameters and resulting connection designs and a handful of spot checks for randomly selected connections. While these checks show general agreement, they do not establish systematic verification or conservative bounding. The vendor agreement is attached to the project file, containing extensive limitation of liability clauses and disclaimers about fitness for particular purposes. The calculation package refers to the vendor’s claimed 95% reliability rate and machine learning validation studies, but does not demonstrate independent engineering judgment in translating these claims into project-specific safety margins. The overall documentation suggests that the vendor portal functioned as a source of engineering deliverables rather than a tool under engineering control.

Implications and Classification

The distinction between these documentation postures carries implications for legal liability. For example, under the service posture demonstrated in Scenario A, disputes would be evaluated under professional negligence standards. Courts would ask whether the engineer met the standard of care typical of competent professionals in similar circumstances. The engineer’s liability insurance would respond to claims, and damages would typically be limited to economic losses directly caused by any proven negligence. The vendor’s role would remain contractual, with liability limited by the terms of the software license agreement.

On the other hand, under the product posture shown in Scenario B, strict liability theories become available to claimants. The design could be characterized as a defective product regardless of the engineer’s diligence in following vendor instructions. It is possible for “plan stamping” allegations to gain credibility because the documentation suggests the engineer merely authenticated vendor deliverables rather than exercising independent professional judgment. Product liability insurance, if available, carries different terms and exclusions than professional liability coverage. In fact, the vendor might be joined as a co-defendant under theories that both parties participated in delivering a defective product to the market.

The classification rule emerging from this comparison is straightforward and practical to apply. Service posture exists when documentation demonstrates that machine learning outputs underwent systematic verification or conservative bounding, engineering overrides are justified with technical rationale, validity limits are explicitly stated and verified, and the design process can be reproduced from documented assumptions and code references. In contrast, product posture exists when documentation shows acceptance of vendor outputs with limited verification, absence of systematic conservative translation into code checks, minimal engineering rationale for design decisions, and dependence on vendor deliverables for critical design parameters.
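
The rule lends itself to a simple checklist. The Python sketch below encodes it literally, with each criterion reduced to a yes/no judgment; the function and parameter names are illustrative assumptions, and in practice each criterion would itself rest on documented evidence rather than a boolean.

```python
def classify_posture(*, systematic_verification: bool,
                     conservative_bounding: bool,
                     overrides_justified: bool,
                     validity_limits_stated: bool,
                     reproducible_from_docs: bool) -> str:
    """Service posture requires every documentation criterion to hold;
    any gap shifts the work toward product posture."""
    criteria = [
        systematic_verification or conservative_bounding,  # outputs checked or bounded
        overrides_justified,                               # technical rationale on record
        validity_limits_stated,                            # domain limits stated and verified
        reproducible_from_docs,                            # design reproducible from the file
    ]
    return "service posture" if all(criteria) else "product posture"

# Scenario B: spot checks only, vendor outputs accepted largely as-is.
print(classify_posture(systematic_verification=False, conservative_bounding=False,
                       overrides_justified=False, validity_limits_stated=False,
                       reproducible_from_docs=False))   # -> product posture
```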

Thought Experiment 2: Materiality and Disclosure Obligations

This experiment investigates whether engineers have a duty to disclose machine learning usage, its limitations, and associated monitoring requirements to project stakeholders through two scenarios.

Scenario A: Full Disclosure

The first scenario implements complete disclosure beginning with the proposal phase. The professional services agreement includes a section on “Computational Methods and Limitations” that explains in plain language that preliminary member sizing will employ machine learning algorithms trained on a database of previous projects. This section also specifies that these algorithms excel at routine configurations but may be less reliable for irregular geometries, unusual loading patterns, or innovative structural systems. The document commits to conservative verification of all machine learning outputs through conventional engineering analysis before finalizing design decisions. The submittal package to the authority having jurisdiction includes a technical memorandum that details the machine learning methodology without requiring reviewers to understand algorithm internals. The document explicitly identifies scenarios where machine learning recommendations were overridden, such as connections near building corners where stress concentrations exceed typical patterns in the training data. Finally, operational triggers are clearly specified in both owner and authority documentation.

Scenario B: Method Opacity

The second scenario produces identical structural designs and safety margins but omits machine learning methodology from all external communications. The professional services agreement uses standard language about employing “current best practices” and “advanced analysis methods” without specificity. The calculation package, while thoroughly demonstrating code compliance, presents final results without describing how preliminary sizes were determined. References to “computerized analysis” and “optimized design procedures” appear occasionally, but without detail about machine learning involvement. The submittal to authorities presents conventional analysis results that verify code compliance for the final design. The reviewing engineer finds no technical deficiencies because the final design is indeed adequate. However, the submittal contains no information about the preliminary sizing methodology, its limitations, or conditions that might require heightened scrutiny during future modifications. Therefore, future engineers examining the structure would see no indication that certain design aspects might be sensitive to conditions outside the machine learning training domain.

Materiality Assessment Through Operational Events

Consider a plausible operational scenario occurring eighteen months after occupancy. The owner converts a portion of the structure for high-density storage. Simultaneously, a moderate wind event causes observable but non-threatening building movement that concerns tenants. Post-event inspection reveals partition cracking at several locations where the load increase coincided with drift-sensitive architectural details. The structure remains safe with ample reserve capacity, but the owner initiates a dispute claiming that undisclosed methodological limitations prevented informed decisions about enhanced monitoring or preliminary strengthening.
In Scenario A, the documented disclosure provides clear evidence that stakeholders were informed about methodology and limitations. The authority can point to their files showing that approval was granted with full knowledge of the computational methods employed. While the partition damage remains unfortunate, the dispute centers on whether the disclosed limitations were adequately conservative rather than whether material information was withheld.

In Scenario B, by contrast, the absence of disclosure creates ambiguity about what stakeholders could reasonably have been expected to know. The owner could argue that knowledge of machine learning involvement would have prompted different decisions about monitoring systems or load restrictions during the design phase. The authority would likely question whether their review would have required additional verification had they known about methodological limitations. The engineer’s position that the final design met all codes becomes less compelling when stakeholders demonstrate that material information affecting their risk assessment was not provided.

The materiality standard emerging from this comparison follows a reasonable decision counterfactual test. Information is material if a reasonable owner or authority would modify their decisions after learning about it. This test does not require proving that different decisions would definitely have been made, only that the information could reasonably affect the decision-making process. Under this standard, machine learning methodology and limitations are material because they affect risk assessment, monitoring decisions, future modification planning, insurance coverage determinations, and due diligence for property transactions.

Thought Experiment 3: Operational Governance and Threshold Modification

This experiment examines governance requirements when machine learning-based monitoring systems undergo threshold adjustments during building operations. Two scenarios are examined involving identical flat-plate office buildings with sensor networks generating daily structural health indices that estimate punching shear risk at column locations.

Scenario A: Governed Threshold Adjustment

The first scenario begins with a monitoring protocol documented in a controlled charter. The initial system sets an evacuation threshold at a daily health index of 0.85 (a hypothetical index presented for the sake of discussion), above which possible punching shear distress requires immediate action. Additionally, the charter defines special conditions, including events with more than 300 attendees, installation of heavy equipment, or any activity that concentrates loads near column lines. When these special conditions coincide with health indices above 0.65 but below the evacuation threshold, the protocol requires consultation with the structural engineer of record.

After six months of operation, weekend furniture relocations begin triggering nuisance alerts when indices reach 0.66 to 0.68, thereby prompting maintenance staff complaints about false alarms. The owner requests a threshold adjustment to reduce disruptions without compromising safety. The engineer analyzes three months of sensor data and confirms that furniture moves create brief spikes that dissipate within hours without cumulative effects. Based on this analysis, the engineer raises the consultation threshold from 0.65 to 0.70 for single-day excursions while maintaining the original 0.65 threshold for any patterns persisting beyond 24 hours. This modification is documented through a formal revision to the monitoring charter. The change includes the engineer’s technical rationale with supporting data analysis, revised threshold values with temporal qualifications, maintained requirements for special condition consultation, commitment to quarterly review of sensor data for the first year after modification, and explicit owner acknowledgment of residual risks.
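
The revised charter logic can be expressed compactly. The sketch below is a hypothetical Python rendering of the thresholds described above; the function name and the exact decision order are assumptions, and a real charter would govern far more detail than a single routine.

```python
EVACUATION = 0.85           # hypothetical evacuation threshold from the charter
CONSULT_TRANSIENT = 0.70    # revised threshold for single-day excursions
CONSULT_PERSISTENT = 0.65   # original threshold, retained for patterns beyond 24 hours

def required_action(index: float, hours_elevated: float, special_condition: bool) -> str:
    """Evaluate one daily health-index reading against the revised charter.
    special_condition: >300 attendees, heavy equipment, or concentrated loads
    near column lines, as defined in the charter."""
    if index >= EVACUATION:
        return "evacuate"
    persistent = hours_elevated > 24
    if index > (CONSULT_PERSISTENT if persistent else CONSULT_TRANSIENT):
        return "consult engineer of record"
    if special_condition and index > CONSULT_PERSISTENT:
        return "consult engineer of record (special condition)"
    return "log and monitor"

# The conference event: index 0.68 for eight hours under a special condition.
print(required_action(0.68, hours_elevated=8, special_condition=True))
# -> consult engineer of record (special condition)
```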

Scenario B: Ungoverned Threshold Modification

The owner requests relief from false alarms, and the engineer agrees to adjust the thresholds based on operational experience. The modification is communicated through an email stating that the consultation threshold is increased to 0.70, with the evacuation threshold remaining at 0.85 for safety. The email reaches building management and maintenance supervisors but does not follow formal documentation protocols. With no controlling charter revision to consult, operations staff interpret the email as establishing a simple new rule where action is required only when indices exceed 0.70, without distinguishing between transient spikes and persistent patterns or considering concurrent special conditions.

Critical Event and Governance Assessment

Four months after the threshold modification, a technology conference installs demonstration equipment, including industrial displays, in a corner section of the floor plate. The concentrated load from equipment and attendees causes the health index to rise to 0.68 in affected column zones. The equipment is removed after eight hours when the conference concludes, and indices return to baseline by the following morning.

In Scenario A, operations staff recognize the conference as a special condition defined in the charter. Despite the index remaining below the modified 0.70 threshold, they contact the engineer per protocol. The engineer reviews real-time sensor data, recommends temporary load redistribution for remaining equipment, and schedules an inspection for the following morning. Documentation shows that governance protocols functioned as designed, with special conditions triggering appropriate technical review regardless of threshold values. In contrast, in Scenario B, operations staff observe the 0.68 index but take no action because it remains below the communicated 0.70 threshold. The absence of special condition requirements in the threshold modification email leads staff to believe consultation is unnecessary.
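
For contrast, Scenario B’s emailed rule reduces to a single cutoff. A minimal sketch, assuming staff apply the email literally, shows the same reading going unreviewed:

```python
def ungoverned_action(index: float) -> str:
    """Scenario B's informally communicated rule: one cutoff, no temporal
    qualification, no special-condition clause."""
    if index >= 0.85:
        return "evacuate"
    return "consult engineer of record" if index > 0.70 else "log and monitor"

print(ungoverned_action(0.68))   # -> log and monitor: the conference event goes unreviewed
```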

The distinction between governed and ungoverned threshold modifications becomes clear through this comparison. Governed modifications require formal documentation in controlled protocols with revision tracking, explicit technical rationale based on quantitative analysis, temporal qualifications distinguishing transient from persistent conditions, maintained or enhanced safeguards for special conditions, defined review periods for modified thresholds, and acknowledged owner acceptance of residual risks.

Ungoverned modifications lack these documentary safeguards and create ambiguity about applicable thresholds, confusion regarding special condition responses, absence of technical rationale for reviewers, no mechanism for knowledge transfer to new staff, inability to reconstruct decision-making during disputes, and potential degradation of safety through informal threshold creep. The governance failure is not in the threshold modification itself, which may be technically justified, but in the absence of documentation that ensures consistent implementation and preserved safeguards.
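
One lightweight way to operationalize these documentary safeguards is to treat each charter revision as a structured record. The Python dataclass below is an illustrative schema only; every field name and the example values are assumptions rather than any standard format, and the point is that each governed-modification element has an explicit, trackable home.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class CharterRevision:
    """One controlled revision to the monitoring charter (illustrative schema)."""
    revision_id: str                   # tracked in the charter's revision log
    effective: date
    rationale: str                     # technical basis with supporting data analysis
    thresholds: dict                   # values carrying temporal qualifications
    special_conditions_retained: bool  # safeguards maintained or enhanced
    review_schedule: str               # defined review period for the change
    owner_acknowledgment: str          # documented acceptance of residual risks

rev2 = CharterRevision(
    revision_id="Rev 2",
    effective=date(2026, 3, 1),
    rationale="Three months of sensor data show furniture-move spikes dissipate within hours",
    thresholds={"consult_transient": 0.70, "consult_persistent_24h": 0.65, "evacuate": 0.85},
    special_conditions_retained=True,
    review_schedule="Quarterly for the first year after modification",
    owner_acknowledgment="Owner representative, signed acknowledgment on file",
)
print(rev2.revision_id, rev2.thresholds["consult_transient"])
```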

Conclusion

Machine learning integration in structural engineering preserves fundamental professional obligations while requiring enhanced documentation of engineering judgment. Responsibility allocation depends not on the presence of machine learning algorithms but on the visibility of professional control through documentation, disclosure, and governance. These findings do not require new regulations or special treatment of machine learning technology. Instead, they clarify how existing professional obligations propagate when machine learning tools contribute to engineering decisions, much as finite element models were treated when first introduced. Fortunately, standard templates for calculations, proposals, and operational protocols can incorporate this documentation with minimal overhead. The visibility of engineering judgment through proper documentation, rather than the computational methods themselves, determines whether machine learning integration meets professional obligations in structural engineering practice. ■

About the Author

M.Z. Naser, PhD, PE, is a tenure-track assistant professor at the School of Civil and Environmental Engineering and Earth Sciences and a member of the Artificial Intelligence Research Institute for Science and Engineering (AIRISE) at Clemson University.

References

D.N.D. Hartford, Legal framework considerations in the development of risk acceptance criteria, Structural Safety (2009). https://doi.org/10.1016/j.strusafe.2008.06.011.

X. Ren, K.C. Terwel, P.H.A.J.M. van Gelder, Human and organizational factors influencing structural safety: A review, Structural Safety (2024). https://doi.org/10.1016/j.strusafe.2023.102407.

J. Ittmann, A. Okeil, C.J. Friedland, Standard of Care for the Practicing Structural Engineer, Journal of Legal Affairs and Dispute Resolution in Engineering and Construction (2018). https://doi.org/10.1061/(asce)la.1943-4170.0000265.

H. Furey, S. Hill, S.K. Bhatia, Beyond the code: a philosophical guide to engineering ethics, Taylor and Francis, 2021. https://doi.org/10.4324/9781315643816/BEYOND-CODE-HEIDI-FUREY-SUJATA-BHATIA-SCOTT-HILL.

N.P. Høj, I.B. Kroon, J.S. Nielsen, M. Schubert, System risk modelling and decision-making – Reflections and common pitfalls, Structural Safety 113 (2025) 102469. https://doi.org/10.1016/J.STRUSAFE.2024.102469.

NIST, AI Standards, 2024. https://doi.org/10.6028/NIST.AI.100-5.