teas                                                            Q. Xiong
Internet-Draft                                           ZTE Corporation
Intended status: Standards Track                             K. Kompella
Expires: 31 December 2026                                            HPE
                                                                 D. King
                                                    Lancaster University
                                                            29 June 2026

                      HPC/AI Service Intent Model
                  draft-xkk-teas-hpc-service-intent-00

Abstract

This document defines a common service intent model for High Performance Computing (HPC) and AI workloads over High Performance Wide Area Networks (HP-WANs). The model allows heterogeneous workload managers and orchestration platforms to express endpoint, communication pattern, timing, performance, data movement, policy, and admission requirements for network services without exposing technology-specific tunnel realization details.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 31 December 2026.

Copyright Notice

Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights

Xiong, et al.           Expires 31 December 2026                [Page 1]

Internet-Draft            HPC/AI service intent                June 2026

and restrictions with respect to this document.
Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.

Table of Contents

1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Conventions Used in This Document . . . . . . . . . . . . . . 3
  2.1. Requirements Language . . . . . . . . . . . . . . . . . . 3
3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3
4. Model Scope . . . . . . . . . . . . . . . . . . . . . . . . . 4
5. Model Structure . . . . . . . . . . . . . . . . . . . . . . . 4
6. Relationship to Other Models . . . . . . . . . . . . . . . . 5
7. Open Issues and Design Considerations . . . . . . . . . . . . 6
8. YANG Data Model . . . . . . . . . . . . . . . . . . . . . . . 6
9. Security Considerations . . . . . . . . . . . . . . . . . . . 23
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 23
11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 23
12. References . . . . . . . . . . . . . . . . . . . . . . . . . 23
  12.1. Normative References . . . . . . . . . . . . . . . . . . 23
  12.2. Informative References . . . . . . . . . . . . . . . . . 24
Appendix A. Example . . . . . . . . . . . . . . . . . . . . . . 25
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 27

1. Introduction

HPC and AI workloads increasingly depend on coordinated compute, storage, and network resources across data center domains and geographically distributed sites. Workload managers and orchestration systems often know when a workload should run, which systems need to communicate, how much data is expected to move, and what performance characteristics are needed for the workload to complete successfully.
The HP-WAN environment, including data-intensive applications, high-throughput transmission, completion-time objectives, admission control, traffic scheduling, and host-network collaboration, is described in [I-D.kcrh-hpwan-state-of-art] and [I-D.xhy-hpwan-framework]. Related work on machine learning cluster scheduling, including [I-D.kompella-rtgwg-mlnwsched], describes environments in which workload timing and network behavior affect job completion time and predictability. This document defines a common way for workload-facing systems to express the desired network outcome without directly configuring network mechanisms.

Existing scheduler and orchestration models are platform-specific and primarily describe compute resources, accelerator resources, job placement, queues, and lifecycle state. They do not provide a common, technology-independent model for expressing the network service intent associated with a scheduled workload.

The service intent requirements in this document are informed by the information available from widely deployed workload schedulers and AI orchestration platforms, while the interface is intended for use by data center and inter-data-center network controllers, orchestrators, and brokers. This separation allows workload-facing systems to expose network-relevant intent without becoming responsible for network realization.

This document defines the common service intent model. It is intended to consume scheduler and job metadata defined separately in [I-D.xkk-teas-hpc-scheduler-job-metadata] and to provide a northbound service abstraction for network controllers or orchestrators. The mapping from accepted service intent to tunnels, paths, policy, and resource allocation is defined separately by a tunnel realization model.

2. Conventions Used in This Document

2.1.
Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

3. Terminology

The terms Workload, Job, Task, Scheduler, Scheduler Job Metadata, Service Intent, Tunnel Realization, and Correlation Identifier are defined in [I-D.xkk-teas-hpc-scheduler-job-metadata]. This document uses those terms with the meanings defined there.

This document defines the following terms:

Service Intent Instance:  A network service request associated with scheduler job metadata, containing endpoint, performance, and policy requirements. It is identified by an intent identifier and can reference workload, job, and correlation identifiers from the scheduler job metadata model.

Admission State:  The network controller's response indicating whether a service intent can be fulfilled, including any modifications or constraints.

4. Model Scope

The service intent model expresses what network service is requested by a workload. It includes workload correlation, endpoints, endpoint groups, communication pattern, requested timing, data movement requirements, performance objectives, policy preferences, and admission state.

The model intentionally excludes low-level path computation, label programming, tunnel configuration, and technology-specific traffic engineering parameters. Those details are part of network realization.

The model is intended to be usable at the boundary between workload management domains and network orchestration domains, including data center and inter-data-center environments.

5. Model Structure

module: ietf-hpc-service-intent
  +--rw hpc-service-intent
     +--rw intent* [intent-id]
        +--rw intent-id             string
        +--rw enable?               boolean
        +--rw workload-ref
        |  +--rw workload-id?      string
        |  +--rw job-id?           string
        |  +--rw correlation-id?   string
        +--rw endpoints
        |  +--rw endpoint-group* [endpoint-group-id]
        |     +--rw endpoint-group-id    string
        |     +--rw role?                identityref
        |     +--rw endpoint* [endpoint-id]
        |        +--rw endpoint-id     string
        |        +--rw address?        inet:host
        |        +--rw site-id?        string
        |        +--rw cluster-id?     string
        |        +--rw interface-id?   string
        +--rw communication
        |  +--rw communication-pattern?   identityref
        |  +--rw flow-direction?          identityref
        |  +--rw expected-flow-count?     uint32
        +--rw timing
        |  +--rw requested-start-time?   yang:date-and-time
        |  +--rw latest-start-time?      yang:date-and-time
        |  +--rw requested-end-time?     yang:date-and-time
        |  +--rw deadline?               yang:date-and-time
        |  +--rw duration?               uint32
        |  +--rw duration-unit?          identityref
        +--rw service-objectives
        |  +--rw data-volume?        uint64
        |  +--rw data-volume-unit?   identityref
        |  +--rw bandwidth
        |  |  +--rw minimum-rate?   uint64
        |  |  +--rw maximum-rate?   uint64
        |  |  +--rw target-rate?    uint64
        |  |  +--rw rate-unit?      identityref
        |  +--rw latency
        |  |  +--rw maximum-latency?             uint32
        |  |  +--rw maximum-latency-variation?   uint32
        |  |  +--rw latency-unit?                identityref
        |  +--rw loss
        |  |  +--rw maximum-loss?   decimal64
        |  |  +--rw loss-unit?      identityref
        |  +--rw throughput?         uint64
        |  +--rw throughput-unit?    identityref
        +--rw policy-preferences
        |  +--rw priority?              priority-type
        |  +--rw resilience-level?      identityref
        |  +--rw isolation-required?    boolean
        |  +--rw encryption-required?   boolean
        |  +--rw degrade-allowed?       boolean
        |  +--rw preemptible?           boolean
        +--ro admission-state
           +--ro status?                identityref
           +--ro decision-time?         yang:date-and-time
           +--ro reason?                string
           +--ro admitted-start-time?   yang:date-and-time
           +--ro admitted-end-time?     yang:date-and-time
           +--ro admitted-rate?         uint64
           +--ro realization-ref*       string

             Figure 1: Service intent model structure

6.
Relationship to Other Models

The service intent model can refer to scheduler and job metadata using workload identifiers, job identifiers, or correlation identifiers. This allows a service request to remain independent of the originating scheduler while preserving traceability to the workload.

Once an intent is admitted, one or more realization references can be returned. These references point to network realization state, such as tunnels or controller-managed service instances, without requiring the workload manager to configure those resources directly.

7. Open Issues and Design Considerations

Future revisions need to refine the identity values for workload roles, communication patterns, resilience levels, data volume units, rate units, latency units, loss units, and admission status. The model also needs to define whether admission alternatives are represented inline or as separate candidate service intent instances.

The model needs to clarify which workload-level attributes are essential to service intent and which are only correlation metadata provided by the scheduler or orchestration system. Additional requirements such as cost and fairness also need to be considered and, if in scope, defined in a form that is actionable by a controller.

The lifecycle handling for create, update, activation, completion, suspension, and cancellation of requested HPC or AI services needs further definition. The model also needs to clarify how endpoint groups and communication patterns such as unicast, multicast, point-to-multipoint, and multipoint service requests are represented.

Admission outcomes need further definition, including how accepted, modified, rejected, provisioned, completed, and failed states are returned to the requesting workload manager.

8.
YANG Data Model

The YANG data model is as follows:

module ietf-hpc-service-intent {
  yang-version 1.1;
  namespace "urn:ietf:params:xml:ns:yang:ietf-hpc-service-intent";
  prefix hpc-service;

  import ietf-yang-types {
    prefix yang;
    reference
      "RFC 6991: Common YANG Data Types";
  }
  import ietf-inet-types {
    prefix inet;
    reference
      "RFC 6991: Common YANG Data Types";
  }

  organization
    "IETF Traffic Engineering Architecture and Signaling (TEAS)
     Working Group";
  contact
    "WG Web:
     WG List:

     Editor: Quan Xiong
     Editor: Kireeti Kompella
     Editor: Daniel King";
  description
    "This module defines a common service intent model for High
     Performance Computing (HPC) and AI workloads over High
     Performance Wide Area Networks (HP-WANs). The model allows
     workload managers and orchestration platforms to express
     network service requirements without exposing
     technology-specific realization details.

     Copyright (c) 2026 IETF Trust and the persons identified as
     authors of the code. All rights reserved.

     Redistribution and use in source and binary forms, with or
     without modification, is permitted pursuant to, and subject
     to the license terms contained in, the Revised BSD License
     set forth in Section 4.c of the IETF Trust's Legal Provisions
     Relating to IETF Documents
     (https://trustee.ietf.org/license-info).

     This version of this YANG module is part of RFC XXXX; see
     the RFC itself for full legal notices.";

  revision 2026-04-23 {
    description
      "Initial version of the HPC/AI service intent model.";
    reference
      "RFC XXXX: HPC/AI Service Intent Model";
  }

  /*
   * Identity definitions
   */

  identity endpoint-role {
    description
      "Base identity for endpoint roles.";
  }

  identity compute-node {
    base endpoint-role;
    description
      "Compute node endpoint role.";
  }

  identity storage-node {
    base endpoint-role;
    description
      "Storage node endpoint role.";
  }

  identity parameter-server {
    base endpoint-role;
    description
      "Parameter server endpoint role.";
  }

  identity communication-pattern {
    description
      "Base identity for communication patterns.";
  }

  identity unicast {
    base communication-pattern;
    description
      "Unicast communication pattern.";
  }

  identity multicast {
    base communication-pattern;
    description
      "Multicast communication pattern.";
  }

  identity broadcast {
    base communication-pattern;
    description
      "Broadcast communication pattern.";
  }

  identity all-to-all {
    base communication-pattern;
    description
      "All-to-all communication pattern.";
  }

  identity flow-direction {
    description
      "Base identity for flow directions.";
  }

  identity unidirectional {
    base flow-direction;
    description
      "Unidirectional flow.";
  }

  identity bidirectional {
    base flow-direction;
    description
      "Bidirectional flow.";
  }

  identity symmetric {
    base flow-direction;
    description
      "Symmetric bidirectional flow.";
  }

  identity duration-unit {
    description
      "Base identity for duration units.";
  }

  identity seconds {
    base duration-unit;
    description
      "Duration in seconds.";
  }

  identity minutes {
    base duration-unit;
    description
      "Duration in minutes.";
  }

  identity hours {
    base duration-unit;
    description
      "Duration in hours.";
  }

  identity data-volume-unit {
    description
      "Base identity for data volume units.";
  }

  identity bytes {
    base data-volume-unit;
    description
      "Data volume in bytes.";
  }

  identity kilobytes {
    base data-volume-unit;
    description
      "Data volume in kilobytes.";
  }

  identity megabytes {
    base data-volume-unit;
    description
      "Data volume in megabytes.";
  }

  identity gigabytes {
    base data-volume-unit;
    description
      "Data volume in gigabytes.";
  }

  identity terabytes {
    base data-volume-unit;
    description
      "Data volume in terabytes.";
  }

  identity rate-unit {
    description
      "Base identity for rate units.";
  }

  identity bps {
    base rate-unit;
    description
      "Bits per second.";
  }

  identity kbps {
    base rate-unit;
    description
      "Kilobits per second.";
  }

  identity mbps {
    base rate-unit;
    description
      "Megabits per second.";
  }

  identity gbps {
    base rate-unit;
    description
      "Gigabits per second.";
  }

  identity tbps {
    base rate-unit;
    description
      "Terabits per second.";
  }

  identity latency-unit {
    description
      "Base identity for latency units.";
  }

  identity microseconds {
    base latency-unit;
    description
      "Latency in microseconds.";
  }

  identity milliseconds {
    base latency-unit;
    description
      "Latency in milliseconds.";
  }

  identity loss-unit {
    description
      "Base identity for loss units.";
  }

  identity percentage {
    base loss-unit;
    description
      "Loss as percentage.";
  }

  identity parts-per-million {
    base loss-unit;
    description
      "Loss in parts per million.";
  }

  identity throughput-unit {
    description
      "Base identity for throughput units.";
  }

  identity packets-per-second {
    base throughput-unit;
    description
      "Throughput in packets per second.";
  }

  identity bytes-per-second {
    base throughput-unit;
    description
      "Throughput in bytes per second.";
  }

  identity resilience-level {
    description
      "Base identity for resilience levels.";
  }

  identity none {
    base resilience-level;
    description
      "No resilience required.";
  }

  identity path-protection {
    base resilience-level;
    description
      "Path protection resilience.";
  }

  identity node-protection {
    base resilience-level;
    description
      "Node protection resilience.";
  }

  identity link-protection {
    base resilience-level;
    description
      "Link protection resilience.";
  }

  identity admission-status {
    description
      "Base identity for admission status values.";
  }

  identity pending {
    base admission-status;
    description
      "Admission decision pending.";
  }

  identity accepted {
    base admission-status;
    description
      "Service intent accepted as requested.";
  }

  identity modified {
    base admission-status;
    description
      "Service intent accepted with modifications.";
  }

  identity rejected {
    base admission-status;
    description
      "Service intent rejected.";
  }

  identity provisioning {
    base admission-status;
    description
      "Service is being provisioned.";
  }

  identity active {
    base admission-status;
    description
      "Service is active and operational.";
  }

  identity completed {
    base admission-status;
    description
      "Service has completed successfully.";
  }

  identity failed {
    base admission-status;
    description
      "Service has failed.";
  }

  /*
   * Typedefs
   */

  typedef priority-type {
    type uint32 {
      range "0..1000";
    }
    description
      "Priority value type, with higher values indicating higher
       priority.";
  }

  /*
   * Groupings
   */

  grouping workload-reference-grouping {
    description
      "Workload reference for correlating with scheduler job
       metadata.";
    leaf workload-id {
      type string;
      description
        "Reference to workload identifier from scheduler metadata.";
    }
    leaf job-id {
      type string;
      description
        "Reference to job identifier from scheduler metadata.";
    }
    leaf correlation-id {
      type string;
      description
        "Correlation identifier for cross-system tracing.";
    }
  }

  grouping endpoint-grouping {
    description
      "Endpoint identification and location information.";
    leaf endpoint-id {
      type string;
      mandatory true;
      description
        "Unique identifier for the endpoint within the group.";
    }
    leaf address {
      type inet:host;
      description
        "Network address of the endpoint.";
    }
    leaf site-id {
      type string;
      description
        "Site or data center identifier where the endpoint is
         located.";
    }
    leaf cluster-id {
      type string;
      description
        "Cluster identifier within the site.";
    }
    leaf interface-id {
      type string;
      description
        "Network interface identifier.";
    }
  }

  grouping endpoint-group-list-grouping {
    description
      "Group of endpoints with common role.";
    list endpoint-group {
      key "endpoint-group-id";
      description
        "List of endpoint groups.";
      leaf endpoint-group-id {
        type string;
        description
          "Unique identifier for the endpoint group.";
      }
      leaf role {
        type identityref {
          base endpoint-role;
        }
        description
          "Functional role of the endpoints in this group.";
      }
      list endpoint {
        key "endpoint-id";
        description
          "List of endpoints in the group.";
        uses endpoint-grouping;
      }
    }
  }

  grouping communication-grouping {
    description
      "Communication pattern and flow characteristics.";
    leaf communication-pattern {
      type identityref {
        base communication-pattern;
      }
      description
        "Pattern of communication between endpoints.";
    }
    leaf flow-direction {
      type identityref {
        base flow-direction;
      }
      description
        "Direction of data flow.";
    }
    leaf expected-flow-count {
      type uint32;
      description
        "Expected number of flows in this communication pattern.";
    }
  }

  grouping timing-grouping {
    description
      "Timing and scheduling requirements.";
    leaf requested-start-time {
      type yang:date-and-time;
      description
        "Requested start time for the service.";
    }
    leaf latest-start-time {
      type yang:date-and-time;
      description
        "Latest acceptable start time for the service.";
    }
    leaf requested-end-time {
      type yang:date-and-time;
      description
        "Requested completion time for the service.";
    }
    leaf deadline {
      type yang:date-and-time;
      description
        "Absolute deadline for service completion.";
    }
    leaf duration {
      type uint32;
      description
        "Requested duration for the service.";
    }
    leaf duration-unit {
      type identityref {
        base duration-unit;
      }
      description
        "Unit for the requested duration.";
    }
  }

  grouping bandwidth-grouping {
    description
      "Bandwidth rate requirements.";
    leaf minimum-rate {
      type uint64;
      description
        "Minimum acceptable bandwidth rate.";
    }
    leaf maximum-rate {
      type uint64;
      description
        "Maximum allowed bandwidth rate.";
    }
    leaf target-rate {
      type uint64;
      description
        "Target or desired bandwidth rate.";
    }
    leaf rate-unit {
      type identityref {
        base rate-unit;
      }
      description
        "Unit for bandwidth rates.";
    }
  }

  grouping latency-grouping {
    description
      "Latency performance requirements.";
    leaf maximum-latency {
      type uint32;
      description
        "Maximum acceptable latency.";
    }
    leaf maximum-latency-variation {
      type uint32;
      description
        "Maximum acceptable latency variation (jitter).";
    }
    leaf latency-unit {
      type identityref {
        base latency-unit;
      }
      description
        "Unit for latency values.";
    }
  }

  grouping loss-grouping {
    description
      "Loss performance requirements.";
    leaf maximum-loss {
      type decimal64 {
        fraction-digits 6;
      }
      description
        "Maximum acceptable loss rate.";
    }
    leaf loss-unit {
      type identityref {
        base loss-unit;
      }
      description
        "Unit for loss rate.";
    }
  }

  grouping service-objectives-grouping {
    description
      "Service performance objectives.";
    leaf data-volume {
      type uint64;
      description
        "Expected data volume to be transferred.";
    }
    leaf data-volume-unit {
      type identityref {
        base data-volume-unit;
      }
      description
        "Unit for data volume.";
    }
    container bandwidth {
      description
        "Bandwidth rate requirements.";
      uses bandwidth-grouping;
    }
    container latency {
      description
        "Latency performance requirements.";
      uses latency-grouping;
    }
    container loss {
      description
        "Loss performance requirements.";
      uses loss-grouping;
    }
    leaf throughput {
      type uint64;
      description
        "Required throughput performance.";
    }
    leaf throughput-unit {
      type identityref {
        base throughput-unit;
      }
      description
        "Unit for throughput values.";
    }
  }

  grouping policy-preferences-grouping {
    description
      "Policy preferences for the service.";
    leaf priority {
      type priority-type;
      description
        "Priority level for the service.";
    }
    leaf resilience-level {
      type identityref {
        base resilience-level;
      }
      description
        "Required resilience level for the service.";
    }
    leaf isolation-required {
      type boolean;
      description
        "Whether traffic isolation is required.";
    }
    leaf encryption-required {
      type boolean;
      description
        "Whether encryption is required.";
    }
    leaf degrade-allowed {
      type boolean;
      description
        "Whether service degradation is allowed if full
         requirements cannot be met.";
    }
    leaf preemptible {
      type boolean;
      description
        "Whether the service can be preempted by higher priority
         services.";
    }
  }

  grouping admission-state-grouping {
    description
      "Admission control state and decision.";
    leaf status {
      type identityref {
        base admission-status;
      }
      description
        "Current admission status of the service intent.";
    }
    leaf decision-time {
      type yang:date-and-time;
      description
        "Time when the admission decision was made.";
    }
    leaf reason {
      type string;
      description
        "Reason for the admission decision.";
    }
    leaf admitted-start-time {
      type yang:date-and-time;
      description
        "Admitted start time for the service.";
    }
    leaf admitted-end-time {
      type yang:date-and-time;
      description
        "Admitted end time for the service.";
    }
    leaf admitted-rate {
      type uint64;
      description
        "Admitted bandwidth rate for the service.";
    }
    leaf-list realization-ref {
      type string;
      description
        "References to network realization instances fulfilling
         this service intent.";
    }
  }

  /*
   * Top-level container
   */

  container hpc-service-intent {
    description
      "Top-level container for HPC/AI service intent.";
    list intent {
      key "intent-id";
      description
        "List of service intent instances.";
      leaf intent-id {
        type string;
        description
          "Unique identifier for the service intent instance.";
      }
      leaf enable {
        type boolean;
        description
          "Administrative state of the service intent.";
      }
      container workload-ref {
        description
          "Reference to workload metadata.";
        uses workload-reference-grouping;
      }
      container endpoints {
        description
          "Endpoint definitions for the service.";
        uses endpoint-group-list-grouping;
      }
      container communication {
        description
          "Communication pattern and flow characteristics.";
        uses communication-grouping;
      }
      container timing {
        description
          "Timing and scheduling requirements.";
        uses timing-grouping;
      }
      container service-objectives {
        description
          "Service performance objectives.";
        uses service-objectives-grouping;
      }
      container policy-preferences {
        description
          "Policy preferences for the service.";
        uses policy-preferences-grouping;
      }
      container admission-state {
        config false;
        description
          "Admission control state and decision (read-only).";
        uses admission-state-grouping;
      }
    }
  }
}

9. Security Considerations

Service intent information can reveal endpoint locations, timing, capacity requirements, data movement patterns, and workload sensitivity. Implementations need to authenticate and authorize entities that create, read, modify, or cancel service intent instances. Transport protection and access control are required when this model is used across administrative or trust boundaries.

10. IANA Considerations

IANA is requested to register one URI in the "IETF XML Registry" [RFC3688]. Following the format in [RFC3688], the following registration is requested:

URI: urn:ietf:params:xml:ns:yang:ietf-hpc-service-intent
Registrant Contact: The IESG.
XML: N/A; the requested URI is an XML namespace.

IANA is requested to register the following YANG module in the "YANG Module Names" registry [RFC6020].

name: ietf-hpc-service-intent
namespace: urn:ietf:params:xml:ns:yang:ietf-hpc-service-intent
prefix: hpc-service
reference: RFC XXXX

11.
Acknowledgements

The authors acknowledge the related HP-WAN framework and problem statement work that provides the broader context for this service intent model.

12. References

12.1. Normative References

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.

[RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017.

12.2. Informative References

[I-D.kcrh-hpwan-state-of-art]  King, D., Chown, T., Rapier, C., Huang, D., and K. Yao, "Current State of the Art for High Performance Wide Area Networks", Work in Progress, Internet-Draft, draft-kcrh-hpwan-state-of-art-03, 20 October 2025.

[I-D.kompella-rtgwg-mlnwsched]  Kompella, K., Beeram, V. P., Mahale, A., Bhargava, R., and N. Geyer, "Scheduling Network Resources for Machine Learning Clusters", Work in Progress, Internet-Draft, draft-kompella-rtgwg-mlnwsched-02, 1 March 2026.

[I-D.xhy-hpwan-framework]  Xiong, Q., Huang, G., Yao, K., and C. Lin, "Framework for High Performance Wide Area Network (HP-WAN)", Work in Progress, Internet-Draft, draft-xhy-hpwan-framework-03, 20 October 2025.

[I-D.xiong-hpwan-problem-statement]  Xiong, Q., Yao, K., Huang, C., Han, Z., and J. Zhao, "Problem Statement for High Performance Wide Area Networks", Work in Progress, Internet-Draft, draft-xiong-hpwan-problem-statement-03, 25 February 2025.

[I-D.xkk-teas-hpc-scheduler-job-metadata]  Xiong, Q., Kompella, K., and D. King, "HPC/AI Scheduler Job Metadata Model", Work in Progress, Internet-Draft, draft-xkk-teas-hpc-scheduler-job-metadata-00, 23 April 2026.

Appendix A. Example

This section provides an example of a service intent instance for a distributed AI training workload.
The example demonstrates how workload requirements are expressed using the service intent model.

Consider a scenario where an AI training job requires communication between multiple compute nodes across two data centers. The job involves parameter synchronization between worker nodes and requires guaranteed bandwidth with low latency.

{
  "ietf-hpc-service-intent:hpc-service-intent": {
    "intent": [
      {
        "intent-id": "ai-training-job-2026-001",
        "enable": true,
        "workload-ref": {
          "workload-id": "distributed-training-001",
          "job-id": "job-2026-04-23-001",
          "correlation-id": "corr-ai-training-001"
        },
        "endpoints": {
          "endpoint-group": [
            {
              "endpoint-group-id": "worker-nodes",
              "role": "compute-node",
              "endpoint": [
                {
                  "endpoint-id": "worker-1",
                  "address": "192.0.2.10",
                  "site-id": "dc-west",
                  "cluster-id": "gpu-cluster-1"
                },
                {
                  "endpoint-id": "worker-2",
                  "address": "192.0.2.11",
                  "site-id": "dc-west",
                  "cluster-id": "gpu-cluster-1"
                },
                {
                  "endpoint-id": "worker-3",
                  "address": "198.51.100.20",
                  "site-id": "dc-east",
                  "cluster-id": "gpu-cluster-2"
                }
              ]
            }
          ]
        },
        "communication": {
          "communication-pattern": "all-to-all",
          "flow-direction": "bidirectional",
          "expected-flow-count": 6
        },
        "timing": {
          "requested-start-time": "2026-04-23T10:00:00Z",
          "latest-start-time": "2026-04-23T10:15:00Z",
          "deadline": "2026-04-23T12:00:00Z",
          "duration": 120,
          "duration-unit": "minutes"
        },
        "service-objectives": {
          "data-volume": "500",
          "data-volume-unit": "gigabytes",
          "bandwidth": {
            "minimum-rate": "100",
            "target-rate": "200",
            "maximum-rate": "400",
            "rate-unit": "gbps"
          },
          "latency": {
            "maximum-latency": 5,
            "maximum-latency-variation": 1,
            "latency-unit": "milliseconds"
          },
          "loss": {
            "maximum-loss": "0.0001",
            "loss-unit": "percentage"
          }
        },
        "policy-preferences": {
          "priority": 100,
          "resilience-level": "path-protection",
          "isolation-required": true,
          "encryption-required": true,
          "degrade-allowed": false,
          "preemptible": false
        },
        "admission-state": {
          "status": "pending",
          "decision-time": "2026-04-23T09:45:00Z"
        }
      }
    ]
  }
}

Authors' Addresses

Quan Xiong
ZTE Corporation
Email: xiong.quan@zte.com.cn

Kireeti Kompella
HPE
Email: kireeti.ietf@gmail.com

Daniel King
Lancaster University
Email: d.king@lancaster.ac.uk
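
Note on the Appendix A example: a requester may want to sanity-check an intent before submitting it. The following is a minimal Python sketch of such a check and is not part of the model. The function names (check_intent, required_rate_bps, local_name), the decimal unit multipliers, and the specific ordering rules are assumptions made for illustration; the draft does not define validation rules or state whether units are decimal or binary.

```python
from datetime import datetime

# Assumption of this sketch: decimal (SI) multipliers for the unit identities.
RATE_BPS = {"bps": 1, "kbps": 10**3, "mbps": 10**6,
            "gbps": 10**9, "tbps": 10**12}
VOLUME_BYTES = {"bytes": 1, "kilobytes": 10**3, "megabytes": 10**6,
                "gigabytes": 10**9, "terabytes": 10**12}


def local_name(identity):
    """Drop an optional 'module-name:' prefix from an identityref value."""
    return identity.split(":")[-1]


def parse_time(value):
    """Parse a date-and-time string; 'Z' is rewritten for older Pythons."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def required_rate_bps(intent):
    """Average bit rate needed to move data-volume between latest-start-time
    and deadline, or None when the needed leaves are absent."""
    timing = intent.get("timing", {})
    objectives = intent.get("service-objectives", {})
    if "latest-start-time" not in timing or "deadline" not in timing:
        return None
    if "data-volume" not in objectives or "data-volume-unit" not in objectives:
        return None
    window = (parse_time(timing["deadline"])
              - parse_time(timing["latest-start-time"])).total_seconds()
    if window <= 0:
        return None
    # int() accepts both JSON numbers and quoted integer strings.
    unit = local_name(objectives["data-volume-unit"])
    volume_bits = int(objectives["data-volume"]) * VOLUME_BYTES[unit] * 8
    return volume_bits / window


def check_intent(intent):
    """Return a list of problems found in one entry of the intent list."""
    problems = []
    timing = intent.get("timing", {})
    bandwidth = intent.get("service-objectives", {}).get("bandwidth", {})

    # Hypothetical rule: start times and deadline must be in order.
    order = ("requested-start-time", "latest-start-time", "deadline")
    times = [(k, parse_time(timing[k])) for k in order if k in timing]
    for (a, ta), (b, tb) in zip(times, times[1:]):
        if ta > tb:
            problems.append(f"{a} is later than {b}")

    # Hypothetical rule: minimum <= target <= maximum rate.
    order = ("minimum-rate", "target-rate", "maximum-rate")
    rates = [(k, int(bandwidth[k])) for k in order if k in bandwidth]
    for (a, ra), (b, rb) in zip(rates, rates[1:]):
        if ra > rb:
            problems.append(f"{a} exceeds {b}")

    # Feasibility: can the volume be moved in time at the maximum rate?
    needed = required_rate_bps(intent)
    if needed is not None and "maximum-rate" in bandwidth \
            and "rate-unit" in bandwidth:
        unit = local_name(bandwidth["rate-unit"])
        ceiling = int(bandwidth["maximum-rate"]) * RATE_BPS[unit]
        if needed > ceiling:
            problems.append("data-volume cannot be moved before the "
                            "deadline at maximum-rate")
    return problems


# Subset of the Appendix A instance that the checks above look at.
APPENDIX_A_SUBSET = {
    "timing": {
        "requested-start-time": "2026-04-23T10:00:00Z",
        "latest-start-time": "2026-04-23T10:15:00Z",
        "deadline": "2026-04-23T12:00:00Z",
    },
    "service-objectives": {
        "data-volume": "500",
        "data-volume-unit": "gigabytes",
        "bandwidth": {"minimum-rate": "100", "target-rate": "200",
                      "maximum-rate": "400", "rate-unit": "gbps"},
    },
}
```

With the Appendix A values, the window between the latest start time and the deadline is 6300 seconds and the volume is 4 x 10^12 bits under the decimal-unit assumption, so the average rate needed is roughly 0.63 Gbit/s, well below the requested maximum of 400 Gbit/s, and check_intent(APPENDIX_A_SUBSET) reports no problems.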