<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Batch Processing on Wenzhuo Huang | DevOps Engineer</title>
    <link>https://socake.github.io/tags/%E6%89%B9%E5%A4%84%E7%90%86/</link>
    <description>Recent content in Batch Processing on Wenzhuo Huang | DevOps Engineer</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>zh-CN</language>
    <managingEditor>17691281867@163.com (Wenzhuo Huang)</managingEditor>
    <webMaster>17691281867@163.com (Wenzhuo Huang)</webMaster>
    <copyright>© 2026 Wenzhuo Huang</copyright>
    <lastBuildDate>Sun, 12 Apr 2026 11:00:00 +0800</lastBuildDate>
    <atom:link href="https://socake.github.io/tags/%E6%89%B9%E5%A4%84%E7%90%86/index.xml" rel="self" type="application/rss+xml" />
    
    <item>
      <title>Argo Workflows in Practice: Batch Processing and ML Pipelines</title>
      <link>https://socake.github.io/posts/argo-workflows-practice/</link>
      <pubDate>Sun, 12 Apr 2026 11:00:00 +0800</pubDate>
      <author>17691281867@163.com (Wenzhuo Huang)</author>
      <guid>https://socake.github.io/posts/argo-workflows-practice/</guid>
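      <description>Argo Workflows is a Kubernetes-native workflow engine well suited to batch processing and ML pipelines. This post covers how it compares with Airflow and Temporal, its core resource model, three complete hands-on examples (DAG data processing, an ML training pipeline, and scheduled backups), resource controls (Semaphore/Node Selector), event-driven triggering with Argo Events, Prometheus monitoring, and troubleshooting of common problems.</description>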
      <media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/argo-workflows-practice/featured.jpg" />
    </item>
    
    <item>
      <title>Volcano Batch Scheduling in Practice: Gang Scheduling, Queues, and Preemption for AI Training Clusters</title>
      <link>https://socake.github.io/posts/volcano-gpu-batch-scheduling/</link>
      <pubDate>Wed, 25 Mar 2026 15:30:00 +0800</pubDate>
      <author>17691281867@163.com (Wenzhuo Huang)</author>
      <guid>https://socake.github.io/posts/volcano-gpu-batch-scheduling/</guid>
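      <description>The default K8s scheduler is a poor fit for AI training. Volcano brings HPC scheduling concepts to K8s: Gang Scheduling, Queue, Fairshare, Preemption, and topology affinity. This post explains how to put it into production on an AI training cluster.</description>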
      <media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/volcano-gpu-batch-scheduling/featured.jpg" />
    </item>
    
    <item>
      <title>Kueue Batch Scheduling in Practice: Making Kubernetes Truly Handle AI/HPC Workloads</title>
      <link>https://socake.github.io/posts/kueue-batch-workload/</link>
      <pubDate>Sat, 15 Mar 2025 09:40:00 +0800</pubDate>
      <author>17691281867@163.com (Wenzhuo Huang)</author>
      <guid>https://socake.github.io/posts/kueue-batch-workload/</guid>
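      <description>Put AI training jobs on Kubernetes and on day one you will find the native scheduler falls far short: no queues, no quota, no gang scheduling, no fair sharing, and messy preemption semantics. Kueue is the official answer from sig-scheduling; it stays closer to native Kubernetes than Volcano and is more mature than a home-grown controller. These are notes from real production use.</description>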
      <media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://socake.github.io/posts/kueue-batch-workload/featured.jpg" />
    </item>
    
  </channel>
</rss>
