<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <author>
    <name>John Doe</name>
  </author>
  <generator uri="https://hexo.io/">Hexo</generator>
  <id>http://example.com/</id>
  <link href="http://example.com/" rel="alternate"/>
  <link href="http://example.com/atom.xml" rel="self"/>
  <rights>All rights reserved 2026, John Doe</rights>
  <title>Hexo</title>
  <updated>2026-04-28T13:16:41.646Z</updated>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
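      <![CDATA[<p>Inlining needs little introduction; its importance is well known.</p><p>The basic idea is not hard:</p><ol><li>Build the call graph at the module level.</li><li>Compute the SCCs (strongly connected components) of the call graph.</li><li>Traverse the SCCs and try to inline (watch out for recursive functions). Functions in the same SCC form a recursion chain.</li><li>Once inlining is decided:<ol><li>Clone the callee, recording the mapping as <code>valuemap[old] &#x3D; new</code>.</li><li>Add the cloned body to the caller.</li><li>Update the SSA def-use chains.</li></ol></li></ol><p>Most of the work is in estimating cost and gain, which relies on various heuristics.</p>]]>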
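      <![CDATA[<p>Steps 2 and 3 above can be sketched as follows. This is only a minimal illustration, not a real inliner: <code>CallGraph</code>, <code>SccFinder</code> and <code>isRecursive</code> are hypothetical names, functions are assumed to be plain integer ids, and Tarjan's algorithm is one possible way to compute the SCCs (the post does not say which one is used). Tarjan's algorithm happens to emit the components callee-first, so walking its output in order visits callees before their callers.</p>
<pre><code class="cpp">#include &lt;algorithm&gt;
#include &lt;cstddef&gt;
#include &lt;vector&gt;

// Hypothetical call graph: graph[f] lists the ids of the functions f calls.
using CallGraph = std::vector&lt;std::vector&lt;int&gt;&gt;;

// Tarjan's SCC algorithm. Components are emitted callee-first (bottom-up).
class SccFinder {
public:
    explicit SccFinder(const CallGraph&amp; graph)
        : graph_(graph), index_(graph.size(), -1), low_(graph.size(), 0),
          onStack_(graph.size(), false) {}

    std::vector&lt;std::vector&lt;int&gt;&gt; run() {
        for (std::size_t f = 0; f &lt; graph_.size(); ++f)
            if (index_[f] == -1) visit(static_cast&lt;int&gt;(f));
        return sccs_;
    }

private:
    void visit(int f) {
        index_[f] = low_[f] = next_++;
        stack_.push_back(f);
        onStack_[f] = true;
        for (int callee : graph_[f]) {
            if (index_[callee] == -1) {          // tree edge: recurse first
                visit(callee);
                low_[f] = std::min(low_[f], low_[callee]);
            } else if (onStack_[callee]) {       // edge back into the open SCC
                low_[f] = std::min(low_[f], index_[callee]);
            }
        }
        if (low_[f] != index_[f]) return;        // f is not the root of its SCC
        std::vector&lt;int&gt; scc;
        int member;
        do {                                     // pop the whole component
            member = stack_.back();
            stack_.pop_back();
            onStack_[member] = false;
            scc.push_back(member);
        } while (member != f);
        sccs_.push_back(scc);
    }

    const CallGraph&amp; graph_;
    std::vector&lt;int&gt; index_, low_;
    std::vector&lt;bool&gt; onStack_;
    std::vector&lt;int&gt; stack_;
    std::vector&lt;std::vector&lt;int&gt;&gt; sccs_;
    int next_ = 0;
};

// A function is recursive if its SCC has several members or it calls itself.
bool isRecursive(const CallGraph&amp; graph, const std::vector&lt;int&gt;&amp; scc) {
    if (scc.size() &gt; 1) return true;
    const int f = scc.front();
    return std::find(graph[f].begin(), graph[f].end(), f) != graph[f].end();
}
</code></pre>
<p>An inliner built on this sketch would walk the returned components in order and would typically skip, or treat conservatively, every call edge inside a component for which <code>isRecursive</code> returns true.</p>]]>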
    </content>
    <id>http://example.com/2026/04/28/inline/</id>
    <link href="http://example.com/2026/04/28/inline/"/>
    <published>2026-04-28T13:16:41.646Z</published>
    <summary>
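      <![CDATA[<p>Inlining needs little introduction; its importance is well known.</p>
<p>The basic idea is not hard:</p>
<ol>
<li>Build the call graph at the module level.</li>
<li>Compute the SCCs (strongly connected components) of the call graph.</li>
<li>Traverse the SCCs and try to inline (watch out for recursive functions). Functions in the same SCC form a recursion]]>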
    </summary>
    <title>inline</title>
    <updated>2026-04-28T13:16:41.646Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
      <![CDATA[<figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">asm</span> <span class="title function_">volatile</span> <span class="params">(</span></span><br><span class="line"><span class="params">    <span class="string">&quot;asm template&quot;</span></span></span><br><span class="line"><span class="params">    : output_operands        <span class="comment">// ← after the first colon</span></span></span><br><span class="line"><span class="params">    : input_operands         <span class="comment">// ← after the second colon</span></span></span><br><span class="line"><span class="params">    : clobbers               <span class="comment">// ← after the third colon</span></span></span><br><span class="line"><span class="params">)</span>;</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>Example:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include 
<span class="string">&lt;stdint.h&gt;</span></span></span><br><span class="line"></span><br><span class="line">__attribute__((noinline))</span><br><span class="line"><span class="type">int</span> <span class="title function_">mul_add</span><span class="params">(</span></span><br><span class="line"><span class="params">    <span class="type">int</span> a, <span class="type">int</span> b, <span class="type">int</span> c,</span></span><br><span class="line"><span class="params">    <span class="type">int</span> *sum,</span></span><br><span class="line"><span class="params">    <span class="type">int</span> *acc)</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">asm</span> <span class="title function_">volatile</span> <span class="params">(</span></span><br><span class="line"><span class="params">        <span class="string">&quot;imull  %[b], %[a]\n\t&quot;</span></span></span><br><span class="line"><span class="params">        <span class="string">&quot;leal   (%[a], %[c]), %[s]\n\t&quot;</span></span></span><br><span class="line"><span class="params">        <span class="string">&quot;addl   %[a], %[acc]\n\t&quot;</span></span></span><br><span class="line"><span class="params">        : [a]   <span class="string">&quot;+r&quot;</span>(a),        <span class="comment">// read-write</span></span></span><br><span class="line"><span class="params">          [s]   <span class="string">&quot;=r&quot;</span>(*sum),     <span class="comment">// write-only</span></span></span><br><span class="line"><span class="params">          [acc] <span class="string">&quot;+r&quot;</span>(*acc)      <span class="comment">// read-write</span></span></span><br><span class="line"><span class="params">        : [b]   <span class="string">&quot;r&quot;</span>(b),</span></span><br><span class="line"><span class="params">          [c]   <span class="string">&quot;r&quot;</span>(c)</span></span><br><span class="line"><span class="params">        : 
<span class="string">&quot;cc&quot;</span>, <span class="string">&quot;memory&quot;</span></span></span><br><span class="line"><span class="params">    )</span>;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> a;</span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><p><code>%[b]</code> is a named operand; it refers to the operand declared as <code>[b]</code> in the operand lists after the template.</p><p><code>&quot;+r&quot;</code>: read-write<br><code>&quot;=r&quot;</code>: write-only; the operand is freshly defined and its old value is not read<br><code>&quot;r&quot;</code>: read-only</p>]]>
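      <![CDATA[<p>To make the constraints concrete, here is a plain C++ model of what the <code>mul_add</code> asm above computes. It is only a reading aid under stated assumptions: <code>mul_add_reference</code> is a hypothetical name, and the model ignores that the asm wraps on overflow whereas signed overflow in C++ is undefined.</p>
<pre><code class="cpp">// Plain C++ model of the mul_add inline asm (hypothetical helper, not part of the post).
int mul_add_reference(int a, int b, int c, int* sum, int* acc) {
    a = a * b;     // imull %[b], %[a]         : a is read and written ("+r")
    *sum = a + c;  // leal  (%[a], %[c]), %[s] : old *sum is never read ("=r")
    *acc += a;     // addl  %[a], %[acc]       : *acc is read and written ("+r")
    return a;
}
</code></pre>
<p>The only operand whose previous value does not matter is <code>*sum</code>, which is why it is the one declared <code>&quot;=r&quot;</code>.</p>]]>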
    </content>
    <id>http://example.com/2026/04/28/inline_asm/</id>
    <link href="http://example.com/2026/04/28/inline_asm/"/>
    <published>2026-04-28T13:16:41.646Z</published>
    <summary>
      <![CDATA[<figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="li]]>
    </summary>
    <title>inline asm</title>
    <updated>2026-04-28T13:16:41.646Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
      <![CDATA[<p>Steps:</p><ol><li>Install zola locally.</li><li>Run <code>zola init &lt;your blog dir&gt;</code>, e.g. <code>zola init myblog</code></li></ol><p>The resulting directory layout:</p><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">myblog/</span><br><span class="line">├── config.toml          # configuration file</span><br><span class="line">├── content/             # content files (Markdown)</span><br><span class="line">├── sass/                # Sass stylesheets (optional)</span><br><span class="line">├── static/              # static assets (images, CSS, JS, etc.)</span><br><span class="line">├── templates/           # template files (HTML)</span><br><span class="line">└── themes/              # themes (optional)</span><br></pre></td></tr></table></figure><ul><li>Blog posts go under content.</li><li>Images go under static&#x2F;images and are referenced in a post with <code>![test](/images/&lt;xxx&gt;.png)</code>.</li><li>Post format: the <code>+++</code> front-matter delimiters are required.</li></ul><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">+++</span><br><span class="line">title = &quot;My first post&quot;</span><br><span class="line">+++</span><br><span class="line"></span><br><span class="line">## hello zola</span><br></pre></td></tr></table></figure><ol start="3"><li><code>cd &lt;your blog dir&gt; &amp;&amp; git init</code> to put these directories under version control.</li><li>Add the custom templates <code>templates/index.html</code> and <code>templates/page.html</code>.</li><li>Create the GitHub Actions workflow <code>.github/workflows/deploy.yml</code><br>with the following content:</li></ol><figure class="highlight yml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">name:</span> <span class="string">Deploy</span> <span class="string">Zola</span> <span class="string">site</span> <span class="string">to</span> <span class="string">GitHub</span> <span class="string">Pages</span></span><br><span class="line"></span><br><span class="line"><span class="attr">on:</span></span><br><span class="line">  <span class="attr">push:</span></span><br><span class="line">    <span class="attr">branches:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">master</span>  <span class="comment"># branch that triggers the deployment</span></span><br><span class="line"></span><br><span class="line"><span class="attr">jobs:</span></span><br><span class="line">  <span class="attr">deploy:</span></span><br><span class="line">    <span class="attr">runs-on:</span> <span class="string">ubuntu-latest</span></span><br><span class="line">    <span class="attr">steps:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">Checkout</span></span><br><span class="line">        <span class="attr">uses:</span> 
<span class="string">actions/checkout@v3</span></span><br><span class="line"></span><br><span class="line">      <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">Install</span> <span class="string">Zola</span></span><br><span class="line">        <span class="attr">run:</span> <span class="string">|</span></span><br><span class="line"><span class="string">          wget https://github.com/getzola/zola/releases/download/v0.19.2/zola-v0.19.2-x86_64-unknown-linux-gnu.tar.gz</span></span><br><span class="line"><span class="string">          tar -xzf zola-v0.19.2-x86_64-unknown-linux-gnu.tar.gz</span></span><br><span class="line"><span class="string">          sudo mv zola /usr/local/bin</span></span><br><span class="line"><span class="string"></span></span><br><span class="line">      <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">Build</span> <span class="string">site</span></span><br><span class="line">        <span class="attr">run:</span> <span class="string">zola</span> <span class="string">build</span></span><br><span class="line"></span><br><span class="line">      <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">Deploy</span> <span class="string">to</span> <span class="string">GitHub</span> <span class="string">Pages</span></span><br><span class="line">        <span class="attr">uses:</span> <span class="string">shalzz/zola-deploy-action@master</span></span><br><span class="line">        <span class="attr">env:</span></span><br><span class="line">            <span class="attr">PAGES_BRANCH:</span> <span class="string">gh-pages</span></span><br><span class="line">            <span class="attr">TOKEN:</span> <span class="string">$&#123;&#123;</span> <span class="string">secrets.GITHUB_TOKEN</span> <span 
class="string">&#125;&#125;</span></span><br></pre></td></tr></table></figure><p>Pay attention to the branches here.</p><ul><li>The master branch holds our source files.</li><li>The gh-pages branch holds the build artifacts.</li></ul><blockquote><p>Reference: <a href="https://www.getzola.org/documentation/deployment/github-pages/#github-actions">zola github-actions</a></p></blockquote><ol start="6"><li><p>Push to <code>git@github.com:&lt;username&gt;/&lt;username&gt;.github.io.git</code></p></li><li><p>Configure GitHub Pages:<br><img src="/static/images/image-github-pages.png" alt="alt text"></p></li><li><p>If you use images, set <code>base_url = &quot;https://&lt;username&gt;.github.io/&quot;</code> in <code>config.toml</code>.</p></li><li><p>Wait for the GitHub Actions run to finish, and the site is live.</p></li></ol><hr><h1 id="部署静态网址到服务器"><a href="#部署静态网址到服务器" class="headerlink" title="部署静态网址到服务器"></a>Deploying the static site to a server</h1><p>Requirements:</p><ol><li>A cloud server with an IP address&#x2F;domain name.</li></ol><p>Steps:</p><ol><li>Install zola on the server (a prebuilt binary is easiest).</li><li><strong>Install nginx on the server.</strong></li><li>Clone the blog repository onto the server: <code>git clone example_blog@git</code></li><li>Run <code>cd example_blog &amp;&amp; zola build -u &quot;&quot;</code>. 
The built files end up in <code>example_blog/public</code>.</li><li><code>cp public /var/www/static-site -r</code></li><li>Give nginx access and make sure you have access yourself: <code>sudo chown -R $USER:$USER /var/www/static-site &amp;&amp; sudo chown -R www-data:www-data /var/www/static-site &amp;&amp; sudo chmod -R 755 /var/www/static-site</code></li><li>Edit the nginx configuration (AI-generated)<br>- Create the config file <code>/etc/nginx/sites-available/static-site.conf</code><br>- Add the following content</li></ol><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">server &#123;</span><br><span class="line">  listen 80;</span><br><span class="line">  server_name example.com;  # replace with your domain or server IP (e.g. 192.168.1.100)</span><br><span class="line"></span><br><span class="line">  root /var/www/static-site;  # root directory of the static files</span><br><span class="line">  index index.html index.htm;  # default index files</span><br><span class="line"></span><br><span class="line">  location / &#123;</span><br><span class="line">      try_files $uri $uri/ =404;  # try the file, then the directory, otherwise return 404</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  # Optional: enable Gzip compression</span><br><span class="line">  gzip on;</span><br><span class="line">  gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>- Create the symlink: <code>sudo ln -s /etc/nginx/sites-available/static-site.conf /etc/nginx/sites-enabled/</code></p><ol 
start="8"><li>Reload nginx: <code>sudo nginx -t &amp;&amp; sudo systemctl reload nginx</code></li></ol><blockquote><p>Pitfalls</p><ol><li>I originally planned to use nginx as a reverse proxy in front of a <code>zola serve</code> running in the background on the server, but the generated links all carried the port (<code>ip:1111/xxx</code>), so that was a dead end.</li><li>So I switched to serving static pages. At first I got a 403, which was a permissions problem. Then the blog's links came out as <code>ip/ip/xxx</code>; building with <code>zola build -u &quot;&quot;</code> fixes that. After that everything worked.</li></ol></blockquote><hr><p>I used hexo before, but uploading images was too much hassle :( Images in zola are not that convenient either.<br>AI is really handy, especially for fiddly installation and deployment work like this.</p>]]>
    </content>
    <id>http://example.com/2026/04/28/deploy/</id>
    <link href="http://example.com/2026/04/28/deploy/"/>
    <published>2026-04-28T13:16:41.644Z</published>
    <summary>
      <![CDATA[<p>Steps:</p>
<ol>
<li>Install zola locally.</li>
<li>Run <code>zola init &lt;your blog dir&gt;</code>, e.g. <code>zola init myblog</code></li>
</ol>
<p>The resulting directory layout]]>
    </summary>
    <title>Using zola to set up github.io as a personal blog</title>
    <updated>2026-04-28T13:16:41.644Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
      <![CDATA[<p>We first need to distinguish tree traversal from graph traversal; only DFS is considered here.</p><h1 id="树"><a href="#树" class="headerlink" title="树"></a>Trees</h1><p>Take a binary tree as an example:<br>DFS has pre-order, in-order, and post-order variants.</p><h1 id="图"><a href="#图" class="headerlink" title="图"></a>Graphs</h1><blockquote><p>Only directed graphs with a single start node are considered.</p></blockquote><h2 id="拓扑排序"><a href="#拓扑排序" class="headerlink" title="拓扑排序"></a>Topological sort</h2><p>Because a graph may contain cycles, things get a bit more complicated.</p><ol><li>Start with topological sorting. There are two cases:</li></ol><ul><li>DAG (no cycles): repeatedly remove nodes with in-degree 0, or use reverse post-order.</li></ul><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">Figure 1:</span><br><span class="line">    Entry</span><br><span class="line">    /    \</span><br><span class="line">    A    B</span><br><span class="line">    \   /</span><br><span class="line">    Exit</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>pre-order: [Entry, A, Exit, B]<br>post-order: [Exit, A, B, Entry]<br>reverse-post-order (RPO): [Entry, B, A, Exit], which is a topological order of this DAG.</p><blockquote><p>Note that RPO is not the same as pre-order.</p></blockquote><ul><li>With cycles (loops): a topological order no longer exists, because the cycle cannot be avoided.</li></ul><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span 
class="line">18</span><br></pre></td><td class="code"><pre><span class="line">Figure 2:</span><br><span class="line">        Entry</span><br><span class="line">          │</span><br><span class="line">          ▼</span><br><span class="line">          A ◄────┐</span><br><span class="line">        ↙   ↘    │</span><br><span class="line">       B     C   │</span><br><span class="line">       │     │   │</span><br><span class="line">       │     ▼   │</span><br><span class="line">       │     D   │</span><br><span class="line">       │     │   │</span><br><span class="line">       │     ▼   │</span><br><span class="line">       │     E   │</span><br><span class="line">        ↘   ↙    │</span><br><span class="line">          F ─────┘ (back edge F → A)</span><br><span class="line">          │</span><br><span class="line">          ▼</span><br><span class="line">         Exit</span><br></pre></td></tr></table></figure><p>pre-order: [Entry, A, B, F, Exit, C, D, E]<br>post-order: [Exit, F, B, E, D, C, A, Entry]<br>reverse-post-order: [Entry, A, C, D, E, B, F, Exit]</p><h1 id="数据流分析"><a href="#数据流分析" class="headerlink" title="数据流分析"></a>Data-flow analysis</h1><p>Information can flow in two directions:</p><ol><li>Forward, from entry to exit; e.g. in Figure 2, A-&gt;B and A-&gt;C.</li><li>Backward, from exit to entry; e.g. in Figure 2, B-&gt;A and C-&gt;A.</li></ol><p>For Figure 1, a single RPO pass completes a forward analysis. For Figure 2, a single RPO pass does not reach the fixed point because of the back edge F-&gt;A, so multiple iterations are still needed.<br>For backward analysis there are two approaches: 1. traverse the original graph in post-order; 2. reverse the original graph, then traverse it in RPO<a href="https://eli.thegreenplace.net/2015/directed-graph-traversal-orderings-and-applications-to-data-flow-analysis/">1</a>.</p><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span 
class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">Figure 3:</span><br><span class="line"></span><br><span class="line">A  --&gt; B --&gt; D</span><br><span class="line">      | ^</span><br><span class="line">      v |</span><br><span class="line">       C</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">---------------------</span><br><span class="line"></span><br><span class="line">Figure 4:</span><br><span class="line"></span><br><span class="line">A  &lt;-- B &lt;--- D</span><br><span class="line">      ^ |</span><br><span class="line">      | v</span><br><span class="line">       C</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>Figure 4 is the reverse of Figure 3.<br>Post-order on Figure 3: D, C, B, A.<br>RPO on Figure 4: D, B, C, A.</p><h2 id="性能"><a href="#性能" class="headerlink" title="性能"></a>Performance</h2><p>Forward or backward, iterating over the blocks in their stored order always works.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">bool</span> changed = <span class="literal">false</span>;</span><br><span class="line"><span class="keyword">do</span>&#123;</span><br><span class="line">    changed = <span class="literal">false</span>;</span><br><span class="line">    <span class="keyword">for</span>(<span class="keyword">auto</span> block: func) &#123;</span><br><span class="line">        changed |= ...  <span class="comment">// true if this block's data-flow state changed</span></span><br><span class="line">    &#125;</span><br><span class="line">&#125;<span class="keyword">while</span>(changed);</span><br></pre></td></tr></table></figure><p>For forward analysis, RPO is certainly faster on a DAG. I have not measured the loop case, but given how many branches there are, RPO <code>should</code> be faster than sequential iteration.<br>I have not measured backward analysis either.</p><h1 id="参考"><a href="#参考" 
class="headerlink" title="参考"></a>References</h1><p><a href="https://eli.thegreenplace.net/2015/directed-graph-traversal-orderings-and-applications-to-data-flow-analysis/">https://eli.thegreenplace.net/2015/directed-graph-traversal-orderings-and-applications-to-data-flow-analysis/</a></p>]]>
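      <![CDATA[<p>The traversal orders listed for Figures 1 and 2 can be reproduced with a small sketch. It assumes a hypothetical CFG representation (<code>Cfg</code>, with blocks as integer ids and successors visited in the order they are listed); <code>postOrder</code> and <code>reversePostOrder</code> are illustrative names, not taken from any compiler.</p>
<pre><code class="cpp">#include &lt;algorithm&gt;
#include &lt;vector&gt;

// Hypothetical CFG representation: succs[b] lists the successors of block b.
using Cfg = std::vector&lt;std::vector&lt;int&gt;&gt;;

namespace {
void dfs(const Cfg&amp; succs, int block, std::vector&lt;bool&gt;&amp; visited,
         std::vector&lt;int&gt;&amp; order) {
    visited[block] = true;
    for (int next : succs[block])
        if (!visited[next]) dfs(succs, next, visited, order);
    order.push_back(block);  // emitted after every block first reached through it
}
}  // namespace

// Post-order of the blocks reachable from entry.
std::vector&lt;int&gt; postOrder(const Cfg&amp; succs, int entry) {
    std::vector&lt;bool&gt; visited(succs.size(), false);
    std::vector&lt;int&gt; order;
    dfs(succs, entry, visited, order);
    return order;
}

// Reverse post-order: the post-order sequence read backwards.
std::vector&lt;int&gt; reversePostOrder(const Cfg&amp; succs, int entry) {
    std::vector&lt;int&gt; order = postOrder(succs, entry);
    std::reverse(order.begin(), order.end());
    return order;
}
</code></pre>
<p>Numbering Figure 2 as Entry=0, A=1, B=2, C=3, D=4, E=5, F=6, Exit=7, with successors <code>{{1}, {2, 3}, {6}, {4}, {5}, {6}, {1, 7}, {}}</code>, <code>reversePostOrder</code> returns 0, 1, 3, 4, 5, 2, 6, 7, i.e. [Entry, A, C, D, E, B, F, Exit] as listed above. For a backward analysis, the same function can be run on the reversed graph starting from Exit.</p>]]>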
    </content>
    <id>http://example.com/2026/04/28/rpo/</id>
    <link href="http://example.com/2026/04/28/rpo/"/>
    <published>2026-04-28T00:00:00.000Z</published>
    <summary>
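      <![CDATA[<p>We first need to distinguish tree traversal from graph traversal; only DFS is considered here.</p>
<h1 id="树"><a href="#树" class="headerlink" title="树"></a>Trees</h1><p>Take a binary tree as an example:<br>DFS has pre-order, in-order, post-or]]>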
    </summary>
    <title>rpo</title>
    <updated>2026-04-28T13:16:41.648Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
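      <![CDATA[<h1 id="体系结构"><a href="#体系结构" class="headerlink" title="体系结构"></a>Architecture</h1><ul><li>Computer Architecture: A Quantitative Approach</li></ul><h1 id="编译器相关"><a href="#编译器相关" class="headerlink" title="编译器相关"></a>Compilers</h1><h3 id="比较推荐的书"><a href="#比较推荐的书" class="headerlink" title="比较推荐的书"></a>Recommended books</h3><ol><li>Advanced Compiler Design and Implementation (the Whale Book)</li><li>SSA-based Compiler Design</li><li>深入理解llvm代码生成 (彭成寒 et al.)</li><li>LLVM Code Generation: A deep dive into compiler backend development (Quentin Colombet)</li><li>Engineering a Compiler</li><li><a href="https://gcc.godbolt.org/">https://gcc.godbolt.org/</a>  <a href="https://alive2.llvm.org/ce/">https://alive2.llvm.org/ce/</a></li><li>The Dragon Book: broad and comprehensive, but it spends so long on the front end that I ran out of patience before reaching the back end; the back-end optimization chapters are actually quite good.</li></ol><hr><h3 id="次一级的书"><a href="#次一级的书" class="headerlink" title="次一级的书"></a>Second-tier books</h3><ul><li>The Tiger Book: I bought a copy once; it seemed to get murkier toward the end.</li><li>LLVM Cookbook: an introductory book.</li></ul><hr><p>For machine-independent optimizations, read the Whale Book and Engineering a Compiler. For how the optimizations are actually implemented, read LLVM with help from AI. The papers are mostly quite old, so skimming them is enough.</p><h1 id="TODO"><a href="#TODO" class="headerlink" title="TODO"></a>TODO</h1><ul><li>Nonlinear Dynamics And Chaos</li></ul>]]>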
    </content>
    <id>http://example.com/2026/04/22/books2/</id>
    <link href="http://example.com/2026/04/22/books2/"/>
    <published>2026-04-22T00:00:00.000Z</published>
    <summary>
      <![CDATA[<h1 id="体系结构"><a href="#体系结构" class="headerlink" title="体系结构"></a>Architecture</h1><ul>
<li>Computer Architecture: A Quantitative Approach</li>
</ul>
<h1 id="编译器相关"><a href="#编译器相关"]]>
    </summary>
    <title>Reading list</title>
    <updated>2026-04-28T13:16:41.644Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
      <![CDATA[<h1 id="0"><a href="#0" class="headerlink" title="0"></a>0</h1><ol><li>Build llvm-tblgen: after the cmake configure step, run <code>cd build &amp;&amp; ninja llvm-tblgen</code></li><li>To see the actual tblgen command used for a given .td file, search in build.ninja.<br>For example, here is how X86GenAsmMatcher.inc is generated:</li></ol><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"># Custom command for lib\Target\X86\X86GenAsmMatcher.inc</span><br><span class="line"></span><br><span class="line">build lib/Target/X86/X86GenAsmMatcher.inc | $&#123;cmake_ninja_workdir&#125;lib/Target/X86/X86GenAsmMatcher.inc: CUSTOM_COMMAND bin/llvm-tblgen.exe bin/llvm-tblgen.exe D$:/Code/llvm-project/llvm/lib/Target/X86/X86.td || bin/llvm-min-tblgen.exe bin/llvm-tblgen.exe include/llvm/CodeGen/vt_gen include/llvm/IR/intrinsics_gen lib/LLVMCodeGenTypes.lib lib/LLVMDemangle.lib lib/LLVMSupport.lib lib/LLVMTableGen.lib lib/Support/BLAKE3/LLVMSupportBlake3 utils/TableGen/Basic/obj.LLVMTableGenBasic utils/TableGen/Common/obj.LLVMTableGenCommon</span><br><span class="line">  COMMAND = C:\WINDOWS\system32\cmd.exe /C &quot;cd /D D:\Code\llvm-project\build\lib\Target\X86 &amp;&amp; D:\Code\llvm-project\build\bin\llvm-tblgen.exe -gen-asm-matcher -ID:/Code/llvm-project/llvm/lib/Target/X86 -ID:/Code/llvm-project/build/include -ID:/Code/llvm-project/llvm/include -I D:/Code/llvm-project/llvm/lib/Target D:/Code/llvm-project/llvm/lib/Target/X86/X86.td --write-if-changed -o X86GenAsmMatcher.inc -d X86GenAsmMatcher.inc.d &amp;&amp; &quot;D:\Program Files\bin\cmake.exe&quot; -E cmake_transform_depfile Ninja gccdepfile D:/Code/llvm-project/llvm 
D:/Code/llvm-project/llvm/lib/Target/X86 D:/Code/llvm-project/build D:/Code/llvm-project/build/lib/Target/X86 D:/Code/llvm-project/build/lib/Target/X86/X86GenAsmMatcher.inc.d D:/Code/llvm-project/build/CMakeFiles/d/c78dca343be67cfabd341efa9fb0fbe70dafdeceb081e5cb7ee12866c0dd9055.d&quot;</span><br><span class="line">  DESC = Building X86GenAsmMatcher.inc...</span><br><span class="line">  depfile = CMakeFiles\d\c78dca343be67cfabd341efa9fb0fbe70dafdeceb081e5cb7ee12866c0dd9055.d</span><br><span class="line">  deps = gcc</span><br><span class="line">  restat = 1</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>The actual command is <code>D:/Code/llvm-project/build/bin/llvm-tblgen.exe -gen-asm-matcher -ID:/Code/llvm-project/llvm/lib/Target/X86 -ID:/Code/llvm-project/build/include -ID:/Code/llvm-project/llvm/include -I D:/Code/llvm-project/llvm/lib/Target D:/Code/llvm-project/llvm/lib/Target/X86/X86.td --write-if-changed -o X86GenAsmMatcher.inc</code>. <code>-gen-asm-matcher</code> selects the backend that emits the concrete code.</p><ol start="3"><li>To dump the records, just drop <code>-gen-asm-matcher</code>. The record dump is huge and can be slow to open.</li></ol><h1 id="1-Basic"><a href="#1-Basic" class="headerlink" title="1. Basic"></a>1. 
Basic</h1><p>Basic flow: .td -&gt; records -&gt; .inc</p><h2 id="class，def"><a href="#class，def" class="headerlink" title="class，def"></a>class, def</h2><ul><li>class: much like a class in C++; it defines a type.</li><li>def: a record instance, similar to defining a C++ variable.</li></ul><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">class A&#123;</span><br><span class="line">string fromA = &quot;From A&quot;;</span><br><span class="line">&#125;</span><br><span class="line">class B &#123;</span><br><span class="line">int num = 10;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">def X0: A, B&#123;&#125;</span><br><span class="line">def X1: A, B&#123;&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>The result is:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">------------- Classes -----------------</span><br><span class="line">class A &#123;</span><br><span class="line">  string fromA = &quot;From 
A&quot;;</span><br><span class="line">&#125;</span><br><span class="line">class B &#123;</span><br><span class="line">  int num = 10;</span><br><span class="line">&#125;</span><br><span class="line">------------- Defs -----------------</span><br><span class="line">def X0 &#123;        // A B</span><br><span class="line">  string fromA = &quot;From A&quot;;</span><br><span class="line">  int num = 10;</span><br><span class="line">&#125;</span><br><span class="line">def X1 &#123;        // A B</span><br><span class="line">  string fromA = &quot;From A&quot;;</span><br><span class="line">  int num = 10;</span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>Every member of a record must have a type. The built-in types include <code>int, string, bit, bits&lt;size&gt;, list&lt;type&gt;, dag</code>; user-defined types are classes.</p><h2 id="multiclass-defm"><a href="#multiclass-defm" class="headerlink" title="multiclass,defm"></a>multiclass, defm</h2><p>A multiclass describes a group of records rather than a type. defm defines the whole group of records at once.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line">class Inst&lt;string n, int p&gt;&#123;</span><br><span class="line">    string name = n;</span><br><span class="line">    int price = 
p;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">multiclass Bundle&lt;string base&gt; &#123;</span><br><span class="line">def A: Inst&lt;!strconcat(base, &quot;-&quot;, &quot;A&quot;),1 &gt;;</span><br><span class="line">def B: Inst&lt;!strconcat(base, &quot;-&quot;, &quot;B&quot;),2 &gt;;</span><br><span class="line">def C &#123;</span><br><span class="line">string name = !strconcat(base, &quot;-&quot;, &quot;C&quot;);</span><br><span class="line">string tag = &quot;special&quot;;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">class ShippingPrice&lt;int arg&gt; &#123;</span><br><span class="line">int shippingPrice = arg;</span><br><span class="line">&#125;</span><br><span class="line">defm valuedBundle : Bundle&lt;&quot;valued&quot;&gt;, ShippingPrice&lt;5&gt;;</span><br><span class="line"></span><br><span class="line">def AnotherRecord &#123;</span><br><span class="line">list&lt;Inst&gt; gifts = [valuedBundleA, valuedBundleB];</span><br><span class="line">list&lt;ShippingPrice&gt; ps = [valuedBundleA, valuedBundleB];</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span 
class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line">------------- Classes -----------------</span><br><span class="line">class Inst&lt;string Inst:n = ?, int Inst:p = ?&gt; &#123;</span><br><span class="line">  string name = Inst:n;</span><br><span class="line">  int price = Inst:p;</span><br><span class="line">&#125;</span><br><span class="line">class ShippingPrice&lt;int ShippingPrice:arg = ?&gt; &#123;</span><br><span class="line">  int shippingPrice = ShippingPrice:arg;</span><br><span class="line">&#125;</span><br><span class="line">------------- Defs -----------------</span><br><span class="line">def AnotherRecord &#123;</span><br><span class="line">  list&lt;Inst&gt; gifts = [valuedBundleA, valuedBundleB];</span><br><span class="line">  list&lt;ShippingPrice&gt; ps = [valuedBundleA, valuedBundleB];</span><br><span class="line">&#125;</span><br><span class="line">def valuedBundleA &#123;     // Inst ShippingPrice</span><br><span class="line">  string name = &quot;valued-A&quot;;</span><br><span class="line">  int price = 1;</span><br><span class="line">  int shippingPrice = 5;</span><br><span class="line">&#125;</span><br><span class="line">def valuedBundleB &#123;     // Inst ShippingPrice</span><br><span class="line">  string name = &quot;valued-B&quot;;</span><br><span class="line">  int price = 2;</span><br><span class="line">  int shippingPrice = 5;</span><br><span class="line">&#125;</span><br><span class="line">def valuedBundleC &#123;     // ShippingPrice</span><br><span class="line">  string name = &quot;valued-C&quot;;</span><br><span class="line">  string tag = &quot;special&quot;;</span><br><span class="line">  int shippingPrice = 5;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="let绑定"><a 
href="#let绑定" class="headerlink" title="let绑定"></a>let bindings</h2><p>Mind the precedence: as the generated records below show, a <code>let</code> inside the record body overrides an outer <code>let ... in</code>, which in turn overrides the template argument.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line">class MyClass&lt;string _alias=&quot;&quot;&gt; &#123;</span><br><span class="line">string alias = _alias;</span><br><span class="line">&#125;</span><br><span class="line">let alias = &quot;let from out&quot; in</span><br><span class="line">def A: MyClass&lt;&gt; &#123;&#125;</span><br><span class="line">def B: MyClass&lt;&gt; &#123;</span><br><span class="line">let alias = &quot;let from body&quot;;</span><br><span class="line">&#125;</span><br><span class="line">def C: MyClass&lt;&quot;from arg&quot;&gt;;</span><br><span class="line">let alias = &quot;alias from bigger scope&quot; in &#123;</span><br><span class="line">let alias = &quot;let from out&quot; in</span><br><span class="line">def D: MyClass&lt;&quot;from arg&quot;&gt; &#123;</span><br><span class="line">let alias = &quot;let from body&quot;;</span><br><span class="line">&#125;</span><br><span class="line">def E: MyClass&lt;&quot;will be overridden&quot;&gt;;</span><br><span class="line">&#125; // end &quot;alias from bigger scope&quot;</span><br><span class="line"></span><br><span class="line">def F:MyClass&lt;&quot;from 
arg&quot;&gt;&#123;</span><br><span class="line">    let alias = &quot;let From body&quot;;</span><br><span class="line">&#125;</span><br><span class="line">let alias = &quot;let from Out&quot; in</span><br><span class="line">    def G:MyClass&lt;&quot;from arg&quot;&gt;&#123;&#125;</span><br></pre></td></tr></table></figure><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line">------------- Classes -----------------</span><br><span class="line">class MyClass&lt;string MyClass:_alias = &quot;&quot;&gt; &#123;</span><br><span class="line">  string alias = MyClass:_alias;</span><br><span class="line">&#125;</span><br><span class="line">------------- Defs -----------------</span><br><span class="line">def A &#123; // MyClass</span><br><span class="line">  string alias = &quot;let from out&quot;;</span><br><span class="line">&#125;</span><br><span class="line">def B &#123; // MyClass</span><br><span class="line">  string alias = &quot;let from body&quot;;</span><br><span class="line">&#125;</span><br><span class="line">def C &#123; // MyClass</span><br><span class="line">  string alias = 
&quot;from arg&quot;;</span><br><span class="line">&#125;</span><br><span class="line">def D &#123; // MyClass</span><br><span class="line">  string alias = &quot;let from body&quot;;</span><br><span class="line">&#125;</span><br><span class="line">def E &#123; // MyClass</span><br><span class="line">  string alias = &quot;alias from bigger scope&quot;;</span><br><span class="line">&#125;</span><br><span class="line">def F &#123; // MyClass</span><br><span class="line">  string alias = &quot;let From body&quot;;</span><br><span class="line">&#125;</span><br><span class="line">def G &#123; // MyClass</span><br><span class="line">  string alias = &quot;let from Out&quot;;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h1 id="2-Backend"><a href="#2-Backend" class="headerlink" title="2. Backend"></a>2. Backend</h1><p>clang-tblgen, llvm-tblgen, mlir-tblgen</p>]]>
    </content>
    <id>http://example.com/2026/04/22/llvm/tablegen/</id>
    <link href="http://example.com/2026/04/22/llvm/tablegen/"/>
    <published>2026-04-22T00:00:00.000Z</published>
    <summary>
      <![CDATA[<h1 id="0"><a href="#0" class="headerlink" title="0"></a>0</h1><ol>
<li>Build llvm-tblgen: after the cmake configure step, run <code>cd build &amp;&amp; ninja llvm]]>
    </summary>
    <title>tablegen</title>
    <updated>2026-04-28T13:16:41.648Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
      <![CDATA[<ul><li>Thread organization: grid, block, thread.<br>  All threads launched by one kernel form a grid. The block is the scheduling unit. Launch syntax: <code>&lt;&lt;&lt;grid size, block size&gt;&gt;&gt;</code></li><li>Actual hardware: GPU, SM (streaming multiprocessor), SP.<br>  SM: local register file + shared memory + L1 cache + a number of functional units that perform computations<br>  All threads of a block execute on the same SM.</li></ul><p>CUDA requires that blocks can be executed in any order, i.e. a thread block must not depend on other blocks of the same grid.</p><ul><li><p>warp<br>Within a block, threads are grouped into warps of 32. A warp executes the kernel in the SIMT model: its threads run the same kernel code but may take different branches.<br>When different threads in a warp follow different code paths, this is sometimes called warp divergence. </p></li><li><p>Memory hierarchy</p><ul><li>DRAM / HBM: global memory, accessible by all threads of all blocks<br>  coalesced global memory access:<br>  Global memory is accessed in units of 32 bytes; each such access is called a memory transaction.<br>  A warp maps the memory accesses of its threads onto memory transactions.</li><li>On-chip memory, private to each SM and accessed by the threads of a block:<ul><li><p>shared memory<br>  32 banks x 32 bits&#x2F;cycle<br>  bank conflict: multiple threads in the same warp attempt to access different elements in the same bank.</p></li><li><p>L1 cache</p></li><li><p>register file<br>  If a thread block needs more registers than are available, the kernel cannot be launched.</p></li><li><p>constant memory</p></li></ul></li><li>local memory: private to each thread, but physically located in global memory.<ul><li>used in particular for register spilling</li></ul></li></ul></li><li><p>thread hierarchy</p></li></ul><blockquote><p>The threads of a block are linearized predictably: the first index x moves the fastest,<br>followed by y and then z. This means that in the linearization of a thread indices, consecutive values of threadIdx.x indicate consecutive threads, threadIdx.y has a stride of blockDim.x, and threadIdx.z has a stride of blockDim.x * blockDim.y. 
</p></blockquote><p><code>linear_id = threadIdx.x + threadIdx.y * blockDim.x + threadIdx.z * blockDim.x * blockDim.y</code></p><p>How a thread id is mapped to a memory address is up to the programmer.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">__global__ <span class="type">void</span> <span class="title">transpose_shared</span><span class="params">(<span class="type">float</span>* out, <span class="type">const</span> <span class="type">float</span>* in, <span class="type">int</span> w, <span class="type">int</span> h)</span> </span>&#123;</span><br><span class="line"></span><br><span class="line">    __shared__ <span class="type">float</span> tile[TILE][TILE + <span class="number">1</span>];  <span class="comment">// +1 avoids bank conflicts</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">// ===== 1. Global coordinates =====</span></span><br><span class="line">    <span class="type">int</span> x = blockIdx.x * TILE + threadIdx.x;</span><br><span class="line">    <span class="type">int</span> y = blockIdx.y * TILE + threadIdx.y;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// ===== 2. 
global → shared (contiguous read) =====</span></span><br><span class="line">    <span class="keyword">if</span> (x &lt; w &amp;&amp; y &lt; h) &#123;</span><br><span class="line">        tile[threadIdx.y][threadIdx.x] = in[y * w + x];</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    __syncthreads();</span><br><span class="line"></span><br><span class="line">    <span class="comment">// ===== 3. Swap blockIdx (the key step!) =====</span></span><br><span class="line">    <span class="type">int</span> new_x = blockIdx.y * TILE + threadIdx.x;</span><br><span class="line">    <span class="type">int</span> new_y = blockIdx.x * TILE + threadIdx.y;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// ===== 4. shared → global (contiguous write) =====</span></span><br><span class="line">    <span class="keyword">if</span> (new_x &lt; h &amp;&amp; new_y &lt; w) &#123;</span><br><span class="line">        out[new_y * h + new_x] = tile[threadIdx.x][threadIdx.y];</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>]]>
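      <![CDATA[<p>The effect of the <code>+1</code> padding in <code>tile[TILE][TILE + 1]</code> above can be checked on the host. The following is a minimal C++ sketch, assuming <code>TILE = 32</code> (the post does not state its value), 32 banks that each hold one 4-byte word, and a warp whose 32 threads share the same <code>threadIdx.y</code>. The helper names <code>bank_of</code>, <code>banks_touched_by_column_read</code> and <code>linear_thread_id</code> are made up for illustration and are not CUDA APIs.</p>

```cpp
#include <cassert>
#include <set>

// Assumed shared-memory model: 32 banks, one 4-byte word (one float) per bank slot.
constexpr int kBanks = 32;

// Bank that holds element [row][col] of a float array whose rows are `pitch` floats wide.
int bank_of(int row, int col, int pitch) {
    assert(pitch > 0);
    return (row * pitch + col) % kBanks;
}

// Number of distinct banks hit when the 32 threads of a warp (threadIdx.x = 0..31,
// same threadIdx.y = col) read tile[threadIdx.x][threadIdx.y], i.e. walk down a column.
int banks_touched_by_column_read(int col, int pitch) {
    std::set<int> banks;
    for (int tx = 0; tx < 32; ++tx) {
        banks.insert(bank_of(tx, col, pitch));
    }
    return static_cast<int>(banks.size());
}

// The linearization formula from the text, written as a plain function.
int linear_thread_id(int tx, int ty, int tz, int block_dim_x, int block_dim_y) {
    return tx + ty * block_dim_x + tz * block_dim_x * block_dim_y;
}
```

<p>Under these assumptions, a row pitch of 32 floats sends all 32 threads of the column read <code>tile[threadIdx.x][threadIdx.y]</code> to a single bank, while a pitch of 33 (<code>TILE + 1</code>) puts every thread in its own bank.</p>]]>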
    </content>
    <id>http://example.com/2026/04/13/gpu_cuda/cuda_basic/</id>
    <link href="http://example.com/2026/04/13/gpu_cuda/cuda_basic/"/>
    <published>2026-04-13T00:00:00.000Z</published>
    <summary>
      <![CDATA[<ul>
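<li>Thread organization: grid, block, thread.<br>  All threads launched by one kernel form a grid. The block is the scheduling unit. Launch syntax: <code>&lt;&lt;&lt;grid size, block size&gt;&gt;&gt;</code></li>
<li>Actual hardware: GPU, SM]]>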
    </summary>
    <title>[WIP] CUDA basics</title>
    <updated>2026-04-28T13:16:41.645Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
      <![CDATA[<blockquote><p>Notes from learning MLIR</p></blockquote><h1 id="1-基本概念"><a href="#1-基本概念" class="headerlink" title="1. 基本概念"></a>1. Basic concepts</h1><ul><li>Tree structure: Op, Region, Block, Op (nested recursively)</li><li>Block arguments are used instead of phi nodes</li><li>Structure of an Operation<ol><li>the operation itself</li><li>results: OpResult</li><li>regions</li><li>attrDict, the attribute dictionary</li><li>operands: OpOperand, the classic Use structure</li></ol></li><li>Value is a wrapper around ValueImpl*, which carries type + kind</li><li>Type wraps a TypeStorage*, which essentially boils down to an AbstractType* &#x3D; {dialect, interface, typeid,name, subtypes}</li><li>interface vs trait<ul><li>An interface decouples a semantic description from concrete ops&#x2F;dialects and gives passes an API to program against.<br>  Typical usage:   <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">Dialect *dialect = ...;</span><br><span class="line"><span class="keyword">if</span> (DialectInlinerInterface *interface = <span class="built_in">dyn_cast</span>&lt;DialectInlinerInterface&gt;(dialect)) &#123;</span><br><span class="line"><span class="comment">// The dialect has provided an implementation of this interface.</span></span><br><span class="line">...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li>A trait is more static: it attaches common information to attrs&#x2F;ops&#x2F;types, for example side effects.<br>  <code>Operation *op = ..; if (op-&gt;hasTrait&lt;MyTrait&gt;()) ...</code></li></ul></li></ul><h1 id="2-添加dialect"><a href="#2-添加dialect" class="headerlink" title="2. 添加dialect"></a>2. Adding a dialect</h1><h1 id="3-添加pass"><a href="#3-添加pass" class="headerlink" title="3. 添加pass"></a>3. Adding a pass</h1><h1 id="4-pattern-rewriter"><a href="#4-pattern-rewriter" class="headerlink" title="4. pattern rewriter"></a>4. pattern rewriter</h1><h1 id="5-converter"><a href="#5-converter" class="headerlink" title="5. converter"></a>5. 
converter</h1><h1 id="ref"><a href="#ref" class="headerlink" title="ref"></a>ref</h1><ul><li><a href="https://github.com/KEKE046/mlir-tutorial">https://github.com/KEKE046/mlir-tutorial</a></li></ul>]]>
    </content>
    <id>http://example.com/2026/04/13/llvm/mlir/</id>
    <link href="http://example.com/2026/04/13/llvm/mlir/"/>
    <published>2026-04-13T00:00:00.000Z</published>
    <summary>
      <![CDATA[<blockquote>
<p>Notes from learning MLIR</p>
</blockquote>
<h1 id="1-基本概念"><a href="#1-基本概念" class="headerlink" title="1. 基本概念"></a>1. Basic concepts</h1><ul>
<li>Tree structure]]>
    </summary>
    <title>[WIP]mlir</title>
    <updated>2026-04-28T13:16:41.647Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
      <![CDATA[<p>As a reference point, a full build on a MacBook Air took 26 minutes.</p><figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">cmake -G Ninja \</span><br><span class="line">    -DCMAKE_EXPORT_COMPILE_COMMANDS=ON  \</span><br><span class="line">    -S ./llvm -B build \</span><br><span class="line">    -DCMAKE_BUILD_TYPE=RELWITHDEBINFO \</span><br><span class="line">    -DLLVM_BUILD_EXAMPLES=ON \</span><br><span class="line">    -DLLVM_ENABLE_PROJECTS=<span class="string">&quot;mlir&quot;</span> \</span><br><span class="line">    -DLLVM_TARGETS_TO_BUILD=<span class="string">&quot;Native;NVPTX&quot;</span> \</span><br><span class="line">    -DLLVM_ENABLE_ASSERTIONS=ON</span><br><span class="line"></span><br><span class="line"><span class="built_in">cd</span> build &amp;&amp; ninja</span><br><span class="line"></span><br></pre></td></tr></table></figure><ul><li>Using MLIR as a library</li></ul><ol><li>MLIR has to be built first</li><li>Configure the project with cmake</li></ol><figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"><span class="built_in">set</span> -x</span><br><span class="line"></span><br><span class="line"><span class="built_in">mkdir</span> -p build</span><br><span class="line"><span class="built_in">cd</span> build</span><br><span 
class="line">cmake .. -GNinja -DCMAKE_INSTALL_PREFIX=/&lt;mlir_path&gt;/install  -DCMAKE_PREFIX_PATH=../build</span><br><span class="line">ninja</span><br><span class="line"></span><br></pre></td></tr></table></figure>]]>
    </content>
    <id>http://example.com/2026/04/13/llvm/mlir_build/</id>
    <link href="http://example.com/2026/04/13/llvm/mlir_build/"/>
    <published>2026-04-13T00:00:00.000Z</published>
    <summary>
      <![CDATA[<p>As a reference point, a full build on a MacBook Air took 26 minutes.</p>
<figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="lin]]>
    </summary>
    <title>Building MLIR</title>
    <updated>2026-04-28T13:16:41.647Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
      <![CDATA[<blockquote><p>Reference: <a href="https://cs.wheaton.edu/~tvandrun/writings/cc04.pdf">https://cs.wheaton.edu/~tvandrun/writings/cc04.pdf</a><br>Only expressions are considered, and everything is based on SSA.</p></blockquote><p>GVN stands for global value numbering and PRE for partial redundancy elimination.</p><p>The idea of GVN is simple: walk the dominator tree depth-first, record a value number for every expression in each block, and delete any expression whose value number has already been seen.<br>Consider the following IR:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line">define i32 @foo(i32 %a, i32 %cond) &#123;</span><br><span class="line">bb0:</span><br><span class="line">    %add = add i32 %a, %a</span><br><span class="line">    %c = icmp eq i32 %cond, 0</span><br><span class="line">    br i1 %c, label %bb1, label %bb2</span><br><span class="line">bb1:</span><br><span class="line">    %a2 = add i32 %a, %a</span><br><span class="line">    br label %bb3</span><br><span class="line"></span><br><span class="line">bb2:</span><br><span class="line">    %a3 = add i32 %a, 1</span><br><span class="line">    br label %bb3</span><br><span class="line">bb3:</span><br><span class="line">    %phia = phi i32 [ %a2, %bb1 ], [ %a3, %bb2 ]</span><br><span class="line">    ret i32 %phia</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Walking the dominator tree in the order bb0, bb1, bb2, bb3 gives the following value numbers.<br>%a: v0<br>%cond: v1<br>%add: v2<br>%c:  v3<br>%a2: v2 (same as %add)<br>%a3: v4<br>%phia: v5</p><p>So %a2 is a redundant value and can be removed.</p><hr><p>PRE is more involved.</p><p>In example 2 below, %sa and %a2 compute the same value, but bb1 does not dominate bb3. 
Plain GVN cannot handle this case.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line">define i32 @bar(i32 %a, i32 %cond) &#123;</span><br><span class="line">bb0:</span><br><span class="line">    %c = icmp eq i32 %cond, 0</span><br><span class="line">    br i1 %c, label %bb1, label %bb2</span><br><span class="line">bb1:</span><br><span class="line">    %a2 = add i32 %a, %a</span><br><span class="line">    br label %bb3</span><br><span class="line"></span><br><span class="line">bb2:</span><br><span class="line">    %a3 = add i32 %a, 3</span><br><span class="line">    br label %bb3</span><br><span class="line">bb3:</span><br><span class="line">    %phia = phi i32 [ %a2, %bb1 ], [ %a3, %bb2 ]</span><br><span class="line">    %sa = add i32 %a, %a</span><br><span class="line">    ret i32 %phia</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The PRE algorithm itself is fairly involved and is not covered here.</p><h1 id="GVN-PRE"><a href="#GVN-PRE" class="headerlink" title="GVN-PRE"></a>GVN-PRE</h1><p>GVN-PRE combines the strengths of both: GVN is simple, PRE is powerful.<br>For example 2 above, if a backward data-flow analysis lets us move %sa from bb3 to the end of bb0, then %a2 can be eliminated as well.</p><p>First, however, the available sets have to be determined, i.e. which expressions are available at the end of each block.<br>As in GVN, this is done by walking the dominator tree.<br>The data-flow equations are therefore as follows, where dom(b) is the immediate dominator of b:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">AVAIL_IN[b] = AVAIL_OUT[dom(b)]</span><br><span class="line">AVAIL_OUT[b] = AVAIL_IN[b] U PHI_GEN[b] 
U TMP_GEN[b]</span><br><span class="line"></span><br><span class="line">PHI_GEN is the set of phi nodes of b; TMP_GEN is the set of non-phi definitions in b.</span><br></pre></td></tr></table></figure><p>Anticipability then needs a backward data-flow analysis (recall liveness), so:</p><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">ANTIC_OUT[b] = </span><br><span class="line">    - intersection of ANTIC_IN[i] over all i in succ(b)   if |succ(b)| &gt; 1</span><br><span class="line">    - phi_translate( ANTIC_IN[succ(b)] )                  if |succ(b)| = 1</span><br><span class="line"></span><br><span class="line">ANTIC_IN[b] = clean(ANTIC_OUT[b] U EXP_GEN[b] - TMP_GEN[b])</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">EXP_GEN is the set of values and right-hand-side expressions of the instructions in b.</span><br><span class="line">clean removes expressions that depend on values that have been killed.</span><br></pre></td></tr></table></figure><p>The algorithm is therefore:</p><ol><li>Compute the AVAIL and ANTIC sets.</li><li>Hoist:<br> for b in blocks{<br>     if |preds(b)| &lt; 2<br>         continue<br>     for e in ANTIC_IN[b]:<br>         if e is available in at least one predecessor, then insert e into the predecessors where it is not available.<br> }<br> (Sinking is the mirror image of hoisting and works on the same principle.)</li><li>Delete the redundant expressions.</li></ol>]]>
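      <![CDATA[<p>Returning to the plain GVN walk on the first example (<code>@foo</code>): the value-number table shown there can be reproduced with a few lines of C++. This is only a minimal sketch of dominator-scoped value numbering, not the GVN-PRE algorithm of the referenced paper and not LLVM's implementation. The types <code>Inst</code>, <code>Block</code> and <code>GVN</code> and the function <code>run_example</code> are hypothetical; expressions are keyed by opcode plus the value numbers of their operands, constants are kept as literals, and a phi always receives a fresh number.</p>

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Inst {
    std::string dest;                   // SSA name being defined
    std::string op;                     // opcode; "phi" is never treated as redundant
    std::vector<std::string> operands;  // SSA names or literal constants
};

struct Block {
    std::vector<Inst> insts;
    std::vector<int> domChildren;  // blocks immediately dominated by this one
};

struct GVN {
    std::map<std::string, int> valueOf;  // SSA name -> value number
    std::vector<std::string> redundant;  // names whose value was already available
    int next = 0;

    // `avail` is passed by value, so entries added in one dominator subtree
    // are not visible in its siblings.
    void visit(const std::vector<Block>& blocks, int b,
               std::map<std::string, int> avail) {
        for (const Inst& inst : blocks[b].insts) {
            std::string key = inst.op;
            for (const std::string& o : inst.operands) {
                auto it = valueOf.find(o);
                key += " " + (it != valueOf.end() ? "v" + std::to_string(it->second) : o);
            }
            auto hit = avail.find(key);
            if (inst.op != "phi" && hit != avail.end()) {
                valueOf[inst.dest] = hit->second;  // same value as an earlier expression
                redundant.push_back(inst.dest);
            } else {
                valueOf[inst.dest] = next;
                avail[key] = next++;
            }
        }
        for (int c : blocks[b].domChildren) {
            visit(blocks, c, avail);
        }
    }
};

// Encodes the first example: bb0 dominates bb1, bb2 and bb3.
GVN run_example() {
    std::vector<Block> blocks(4);
    blocks[0].insts = {Inst{"add", "add", {"a", "a"}}, Inst{"c", "icmp_eq", {"cond", "0"}}};
    blocks[0].domChildren = {1, 2, 3};
    blocks[1].insts = {Inst{"a2", "add", {"a", "a"}}};
    blocks[2].insts = {Inst{"a3", "add", {"a", "1"}}};
    blocks[3].insts = {Inst{"phia", "phi", {"a2", "a3"}}};

    GVN g;
    g.valueOf["a"] = g.next++;     // function arguments are numbered first
    g.valueOf["cond"] = g.next++;
    g.visit(blocks, 0, {});
    return g;
}
```

<p>Calling <code>run_example()</code> assigns <code>add</code> and <code>a2</code> the same number and records <code>a2</code> as the only redundant definition, matching the table in the text. Copying the table per subtree keeps the sketch short; a real implementation would more likely use a scoped hash table.</p>]]>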
    </content>
    <id>http://example.com/2026/04/08/gvn-pre/</id>
    <link href="http://example.com/2026/04/08/gvn-pre/"/>
    <published>2026-04-08T00:00:00.000Z</published>
    <summary>
      <![CDATA[<blockquote>
<p>参考 <a href="https://cs.wheaton.edu/~tvandrun/writings/cc04.pdf">https://cs.wheaton.edu/~tvandrun/writings/cc04.pdf</a><br>只考]]>
    </summary>
    <title>Introduction to GVN-PRE</title>
    <updated>2026-04-28T13:16:41.646Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
      <![CDATA[<h1 id="greedy"><a href="#greedy" class="headerlink" title="greedy"></a>greedy</h1><p>TODO</p>]]>
    </content>
    <id>http://example.com/2026/03/22/llvm/regalloc_llvm2/</id>
    <link href="http://example.com/2026/03/22/llvm/regalloc_llvm2/"/>
    <published>2026-03-22T00:00:00.000Z</published>
    <summary>
      <![CDATA[<h1 id="greedy"><a href="#greedy" class="headerlink" title="greedy"></a>greedy</h1><p>TODO</p>]]>
    </summary>
    <title>LLVM register allocation, part 2</title>
    <updated>2026-04-28T13:16:41.648Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
      <![CDATA[<p>This continues the previous post.<br>Here are the data structures that RABasic relies on.</p><ul><li><p>LiveRegMatrix, LiveIntervals, VirtRegMap, LiveIntervalUnion</p></li><li><p>LiveIntervals</p></li></ul><p>This is the core query interface.</p><ol><li>VirtRegIntervals: <code>Map&lt;reg, LiveInterval*&gt; </code></li><li>RegMaskSlots stores the slot indexes of the instructions that carry a regmask operand (mostly calls). It is used when checking interference and is sorted by SlotIndex.</li><li>RegMaskBits: a vector parallel to RegMaskSlots that caches the regmask pointer of each entry for fast lookup</li><li>RegMaskBlocks: <code>Map&lt;block, {begin, size}&gt;</code>, where begin is an index into RegMaskSlots, i.e. it records which regmask-using instructions belong to a given block</li></ol><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span 
class="title class_">LiveIntervals</span> &#123;</span><br><span class="line">  MachineFunction *MF = <span class="literal">nullptr</span>;</span><br><span class="line">  MachineRegisterInfo *MRI = <span class="literal">nullptr</span>;</span><br><span class="line">  <span class="type">const</span> TargetRegisterInfo *TRI = <span class="literal">nullptr</span>;</span><br><span class="line">  <span class="type">const</span> TargetInstrInfo *TII = <span class="literal">nullptr</span>;</span><br><span class="line">  SlotIndexes *Indexes = <span class="literal">nullptr</span>;</span><br><span class="line">  MachineDominatorTree *DomTree = <span class="literal">nullptr</span>;</span><br><span class="line">  std::unique_ptr&lt;LiveIntervalCalc&gt; LICalc;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/// Live interval pointers for all the virtual registers.</span></span><br><span class="line">  IndexedMap&lt;LiveInterval *, VirtReg2IndexFunctor&gt; VirtRegIntervals;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/// Sorted list of instructions with register mask operands. Always use the</span></span><br><span class="line">  <span class="comment">/// &#x27;r&#x27; slot, RegMasks are normal clobbers, not early clobbers.</span></span><br><span class="line">  SmallVector&lt;SlotIndex, <span class="number">8</span>&gt; RegMaskSlots;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/// This vector is parallel to RegMaskSlots, it holds a pointer to the</span></span><br><span class="line">  <span class="comment">/// corresponding register mask.  
This pointer can be recomputed as:</span></span><br><span class="line">  <span class="comment">///</span></span><br><span class="line">  <span class="comment">///   MI = Indexes-&gt;getInstructionFromIndex(RegMaskSlot[N]);</span></span><br><span class="line">  <span class="comment">///   unsigned OpNum = findRegMaskOperand(MI);</span></span><br><span class="line">  <span class="comment">///   RegMaskBits[N] = MI-&gt;getOperand(OpNum).getRegMask();</span></span><br><span class="line">  <span class="comment">///</span></span><br><span class="line">  <span class="comment">/// This is kept in a separate vector partly because some standard</span></span><br><span class="line">  <span class="comment">/// libraries don&#x27;t support lower_bound() with mixed objects, partly to</span></span><br><span class="line">  <span class="comment">/// improve locality when searching in RegMaskSlots.</span></span><br><span class="line">  <span class="comment">/// Also see the comment in LiveInterval::find().</span></span><br><span class="line">  SmallVector&lt;<span class="type">const</span> <span class="type">uint32_t</span> *, <span class="number">8</span>&gt; RegMaskBits;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/// For each basic block number, keep (begin, size) pairs indexing into the</span></span><br><span class="line">  <span class="comment">/// RegMaskSlots and RegMaskBits arrays.</span></span><br><span class="line">  <span class="comment">/// Note that basic block numbers may not be layout contiguous, that&#x27;s why</span></span><br><span class="line">  <span class="comment">/// we can&#x27;t just keep track of the first register mask in each basic</span></span><br><span class="line">  <span class="comment">/// block.</span></span><br><span class="line">  SmallVector&lt;std::pair&lt;<span class="type">unsigned</span>, <span class="type">unsigned</span>&gt;, <span class="number">8</span>&gt; RegMaskBlocks;</span><br><span 
class="line"></span><br><span class="line">  <span class="comment">/// Keeps a live range set for each register unit to track fixed physreg</span></span><br><span class="line">  <span class="comment">/// interference.</span></span><br><span class="line">  SmallVector&lt;LiveRange *, <span class="number">0</span>&gt; RegUnitRanges;</span><br><span class="line">&#125;;</span><br><span class="line"></span><br></pre></td></tr></table></figure><ul><li>LiveIntervalUnion, llvm&#x2F;include&#x2F;llvm&#x2F;CodeGen&#x2F;LiveIntervalUnion.h<br>A wrapper class around IntervalMap.</li></ul><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">LiveIntervalUnion</span>&#123;</span><br><span class="line"></span><br><span class="line">  IntervalMap&lt;SlotIndex, <span class="type">const</span> LiveInterval *&gt; Segments;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">class</span> <span class="title class_">Array</span> &#123;</span><br><span class="line">    <span class="type">unsigned</span> Size = <span class="number">0</span>;</span><br><span class="line">    LiveIntervalUnion *LIUs = <span class="literal">nullptr</span>;</span><br><span class="line">  &#125;;</span><br><span class="line">&#125;;</span><br><span class="line"></span><br></pre></td></tr></table></figure><ul><li>LiveRegMatrix</li></ul><p><code>llvm/include/llvm/CodeGen/LiveRegMatrix.h</code></p><p>It tracks interference between virtual registers and is conceptually a two-dimensional structure: <code>slot index</code> x <code>reg 
units</code>.</p><ul><li>The two-dimensional structure is represented by <code>LiveIntervalUnion::Array Matrix;</code>.</li><li><code>InterferenceKind</code> is the result of an interference query: <code>InterferenceKind checkInterference(const LiveInterval &amp;VirtReg, MCRegister PhysReg)</code><ul><li>IK_Free: no interference.</li><li>IK_VirtReg: other virtual registers already assigned to PhysReg or its aliases interfere; this can be resolved by unassigning them.</li><li>IK_RegUnit: a fixed live range is in the way, typically argument registers for a call; unassigning virtual registers cannot resolve it.</li><li>IK_RegMask: the live range crosses an instruction with a regmask operand that does not preserve PhysReg, typically a call.</li></ul></li></ul><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span 
class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">LiveRegMatrix</span> : <span class="keyword">public</span> MachineFunctionPass &#123;</span><br><span class="line">  <span class="type">const</span> TargetRegisterInfo *TRI = <span class="literal">nullptr</span>;</span><br><span 
class="line">  LiveIntervals *LIS = <span class="literal">nullptr</span>;</span><br><span class="line">  VirtRegMap *VRM = <span class="literal">nullptr</span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// UserTag changes whenever virtual registers have been modified.</span></span><br><span class="line">  <span class="type">unsigned</span> UserTag = <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// The matrix is represented as a LiveIntervalUnion per register unit.</span></span><br><span class="line">  LiveIntervalUnion::Array Matrix;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Cached queries per register unit.</span></span><br><span class="line">  std::unique_ptr&lt;LiveIntervalUnion::Query[]&gt; Queries;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Cached register mask interference info.</span></span><br><span class="line">  <span class="type">unsigned</span> RegMaskTag = <span class="number">0</span>;</span><br><span class="line">  <span class="type">unsigned</span> RegMaskVirtReg = <span class="number">0</span>;</span><br><span class="line">  BitVector RegMaskUsable;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span>:</span><br><span class="line">  <span class="keyword">enum</span> <span class="title class_">InterferenceKind</span> &#123;</span><br><span class="line">    <span class="comment">/// No interference, go ahead and assign.</span></span><br><span class="line">    IK_Free = <span class="number">0</span>,</span><br><span class="line"></span><br><span class="line">    <span class="comment">/// Virtual register interference. There are interfering virtual registers</span></span><br><span class="line">    <span class="comment">/// assigned to PhysReg or its aliases. 
This interference could be resolved</span></span><br><span class="line">    <span class="comment">/// by unassigning those other virtual registers.</span></span><br><span class="line">    IK_VirtReg,</span><br><span class="line"></span><br><span class="line">    <span class="comment">/// Register unit interference. A fixed live range is in the way, typically</span></span><br><span class="line">    <span class="comment">/// argument registers for a call. This can&#x27;t be resolved by unassigning</span></span><br><span class="line">    <span class="comment">/// other virtual registers.</span></span><br><span class="line">    IK_RegUnit,</span><br><span class="line"></span><br><span class="line">    <span class="comment">/// RegMask interference. The live range is crossing an instruction with a</span></span><br><span class="line">    <span class="comment">/// regmask operand that doesn&#x27;t preserve PhysReg. This typically means</span></span><br><span class="line">    <span class="comment">/// VirtReg is live across a call, and PhysReg isn&#x27;t call-preserved.</span></span><br><span class="line">    IK_RegMask</span><br><span class="line">  &#125;;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/// Check for interference before assigning VirtReg to PhysReg.</span></span><br><span class="line">  <span class="comment">/// If this function returns IK_Free, it is legal to assign(VirtReg, PhysReg).</span></span><br><span class="line">  <span class="comment">/// When there is more than one kind of interference, the InterferenceKind</span></span><br><span class="line">  <span class="comment">/// with the highest enum value is returned.</span></span><br><span class="line">  <span class="function">InterferenceKind <span class="title">checkInterference</span><span class="params">(<span class="type">const</span> LiveInterval &amp;VirtReg,</span></span></span><br><span class="line"><span class="params"><span class="function">               
                      MCRegister PhysReg)</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/// Check for interference in the segment [Start, End) that may prevent</span></span><br><span class="line">  <span class="comment">/// assignment to PhysReg. If this function returns true, there is</span></span><br><span class="line">  <span class="comment">/// interference in the segment [Start, End) of some other interval already</span></span><br><span class="line">  <span class="comment">/// assigned to PhysReg. If this function returns false, PhysReg is free at</span></span><br><span class="line">  <span class="comment">/// the segment [Start, End).</span></span><br><span class="line">  <span class="function"><span class="type">bool</span> <span class="title">checkInterference</span><span class="params">(SlotIndex Start, SlotIndex End, MCRegister PhysReg)</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/// Assign VirtReg to PhysReg.</span></span><br><span class="line">  <span class="comment">/// This will mark VirtReg&#x27;s live range as occupied in the LiveRegMatrix and</span></span><br><span class="line">  <span class="comment">/// update VirtRegMap. 
The live range is expected to be available in PhysReg.</span></span><br><span class="line">  <span class="function"><span class="type">void</span> <span class="title">assign</span><span class="params">(<span class="type">const</span> LiveInterval &amp;VirtReg, MCRegister PhysReg)</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/// Unassign VirtReg from its PhysReg.</span></span><br><span class="line">  <span class="comment">/// Assuming that VirtReg was previously assigned to a PhysReg, this undoes</span></span><br><span class="line">  <span class="comment">/// the assignment and updates VirtRegMap accordingly.</span></span><br><span class="line">  <span class="function"><span class="type">void</span> <span class="title">unassign</span><span class="params">(<span class="type">const</span> LiveInterval &amp;VirtReg)</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/// Returns true if the given \p PhysReg has any live intervals assigned.</span></span><br><span class="line">  <span class="function"><span class="type">bool</span> <span class="title">isPhysRegUsed</span><span class="params">(MCRegister PhysReg)</span> <span class="type">const</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">//===--------------------------------------------------------------------===//</span></span><br><span class="line">  <span class="comment">// Low-level interface.</span></span><br><span class="line">  <span class="comment">//===--------------------------------------------------------------------===//</span></span><br><span class="line">  <span class="comment">//</span></span><br><span class="line">  <span class="comment">// Provide access to the underlying LiveIntervalUnions.</span></span><br><span class="line">  <span class="comment">//</span></span><br><span class="line"></span><br><span class="line">  <span class="comment">/// Check 
for regmask interference only.</span></span><br><span class="line">  <span class="comment">/// Return true if VirtReg crosses a regmask operand that clobbers PhysReg.</span></span><br><span class="line">  <span class="comment">/// If PhysReg is null, check if VirtReg crosses any regmask operands.</span></span><br><span class="line">  <span class="function"><span class="type">bool</span> <span class="title">checkRegMaskInterference</span><span class="params">(<span class="type">const</span> LiveInterval &amp;VirtReg,</span></span></span><br><span class="line"><span class="params"><span class="function">                                MCRegister PhysReg = MCRegister::NoRegister)</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/// Check for regunit interference only.</span></span><br><span class="line">  <span class="comment">/// Return true if VirtReg overlaps a fixed assignment of one of PhysRegs&#x27;s</span></span><br><span class="line">  <span class="comment">/// register units.</span></span><br><span class="line">  <span class="function"><span class="type">bool</span> <span class="title">checkRegUnitInterference</span><span class="params">(<span class="type">const</span> LiveInterval &amp;VirtReg,</span></span></span><br><span class="line"><span class="params"><span class="function">                                MCRegister PhysReg)</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/// Query a line of the assigned virtual register matrix directly.</span></span><br><span class="line">  <span class="comment">/// Use MCRegUnitIterator to enumerate all regunits in the desired PhysReg.</span></span><br><span class="line">  <span class="comment">/// This returns a reference to an internal Query data structure that is only</span></span><br><span class="line">  <span class="comment">/// valid until the next query() call.</span></span><br><span class="line">  <span 
class="function">LiveIntervalUnion::Query &amp;<span class="title">query</span><span class="params">(<span class="type">const</span> LiveRange &amp;LR, MCRegister RegUnit)</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/// Directly access the live interval unions per regunit.</span></span><br><span class="line">  <span class="comment">/// This returns an array indexed by the regunit number.</span></span><br><span class="line">  <span class="function">LiveIntervalUnion *<span class="title">getLiveUnions</span><span class="params">()</span> </span>&#123; <span class="keyword">return</span> &amp;Matrix[<span class="number">0</span>]; &#125;</span><br><span class="line"></span><br><span class="line">  <span class="function">Register <span class="title">getOneVReg</span><span class="params">(<span class="type">unsigned</span> PhysReg)</span> <span class="type">const</span></span>;</span><br><span class="line">&#125;;</span><br><span class="line"></span><br></pre></td></tr></table></figure><ul><li>VirtRegMap, used by the rewriter<br>In other words, the register allocator's decisions are not written into the MIR immediately; the MIR is rewritten only later, in the rewriter pass.</li></ul><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span 
class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">VirtRegMap</span> : <span class="keyword">public</span> MachineFunctionPass &#123;</span><br><span class="line">  MachineRegisterInfo *MRI = <span class="literal">nullptr</span>;</span><br><span class="line">  <span class="type">const</span> TargetInstrInfo *TII = <span class="literal">nullptr</span>;</span><br><span class="line">  <span class="type">const</span> TargetRegisterInfo *TRI = <span class="literal">nullptr</span>;</span><br><span class="line">  MachineFunction *MF = <span class="literal">nullptr</span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/// Virt2PhysMap - This is a virtual to physical register</span></span><br><span class="line">  <span class="comment">/// mapping. Each virtual register is required to have an entry in</span></span><br><span class="line">  <span class="comment">/// it; even spilled virtual registers (the register mapped to a</span></span><br><span class="line">  <span class="comment">/// spilled register is the temporary used to load it from the</span></span><br><span class="line">  <span class="comment">/// stack).</span></span><br><span class="line">  IndexedMap&lt;MCRegister, VirtReg2IndexFunctor&gt; Virt2PhysMap;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/// Virt2StackSlotMap - This is virtual register to stack slot</span></span><br><span class="line">  <span class="comment">/// mapping. 
Each spilled virtual register has an entry in it</span></span><br><span class="line">  <span class="comment">/// which corresponds to the stack slot this register is spilled</span></span><br><span class="line">  <span class="comment">/// at.</span></span><br><span class="line">  IndexedMap&lt;<span class="type">int</span>, VirtReg2IndexFunctor&gt; Virt2StackSlotMap;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/// Virt2SplitMap - This is virtual register to split virtual register</span></span><br><span class="line">  <span class="comment">/// mapping.</span></span><br><span class="line">  IndexedMap&lt;Register, VirtReg2IndexFunctor&gt; Virt2SplitMap;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/// Virt2ShapeMap - For X86 AMX register whose register is bound shape</span></span><br><span class="line">  <span class="comment">/// information.</span></span><br><span class="line">  DenseMap&lt;Register, ShapeT&gt; Virt2ShapeMap;</span><br><span class="line">&#125;;</span><br><span class="line"></span><br></pre></td></tr></table></figure><h1 id="regalloc-算法实现"><a href="#regalloc-算法实现" class="headerlink" title="regalloc 算法实现"></a>regalloc algorithm implementation</h1><h2 id="regalloc-base"><a href="#regalloc-base" class="headerlink" title="regalloc base"></a>regalloc base</h2><p>llvm&#x2F;lib&#x2F;CodeGen&#x2F;RegAllocBase.cpp</p><p>The approach is linear scan plus a priority queue.<br>This file defines a set of interfaces.<br>The core of RegAllocBase is: enqueueImpl, dequeue, selectOrSplit</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span 
class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">RegAllocBase</span> &#123;</span><br><span class="line">  <span class="function"><span class="keyword">virtual</span> <span class="type">void</span> <span class="title">anchor</span><span 
class="params">()</span></span>;</span><br><span class="line"><span class="keyword">protected</span>:</span><br><span class="line">  <span class="type">const</span> TargetRegisterInfo *TRI = <span class="literal">nullptr</span>;</span><br><span class="line">  MachineRegisterInfo *MRI = <span class="literal">nullptr</span>;</span><br><span class="line">  VirtRegMap *VRM = <span class="literal">nullptr</span>;</span><br><span class="line">  LiveIntervals *LIS = <span class="literal">nullptr</span>;</span><br><span class="line">  LiveRegMatrix *Matrix = <span class="literal">nullptr</span>;</span><br><span class="line">  RegisterClassInfo RegClassInfo;</span><br><span class="line"></span><br><span class="line"><span class="keyword">private</span>:</span><br><span class="line">  <span class="comment">/// Private, callees should go through shouldAllocateRegister</span></span><br><span class="line">  <span class="type">const</span> RegAllocFilterFunc shouldAllocateRegisterImpl;</span><br><span class="line"><span class="keyword">protected</span>:</span><br><span class="line">  <span class="comment">/// Inst which is a def of an original reg and whose defs are already all</span></span><br><span class="line">  <span class="comment">/// dead after remat is saved in DeadRemats. The deletion of such inst is</span></span><br><span class="line">  <span class="comment">/// postponed till all the allocations are done, so its remat expr is</span></span><br><span class="line">  <span class="comment">/// always available for the remat of all the siblings of the original reg.</span></span><br><span class="line">  SmallPtrSet&lt;MachineInstr *, <span class="number">32</span>&gt; DeadRemats;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">  <span class="comment">// The top-level driver. 
The output is a VirtRegMap that is updated with</span></span><br><span class="line">  <span class="comment">// physical register assignments.</span></span><br><span class="line">  <span class="function"><span class="type">void</span> <span class="title">allocatePhysRegs</span><span class="params">()</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Include spiller post optimization and removing dead defs left because of</span></span><br><span class="line">  <span class="comment">// rematerialization.</span></span><br><span class="line">  <span class="function"><span class="keyword">virtual</span> <span class="type">void</span> <span class="title">postOptimization</span><span class="params">()</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Get a temporary reference to a Spiller instance.</span></span><br><span class="line">  <span class="function"><span class="keyword">virtual</span> Spiller &amp;<span class="title">spiller</span><span class="params">()</span> </span>= <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line">  <span class="function"><span class="keyword">virtual</span> <span class="type">void</span> <span class="title">enqueueImpl</span><span class="params">(<span class="type">const</span> LiveInterval *LI)</span> </span>= <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/// enqueue - Add VirtReg to the priority queue of unassigned registers.</span></span><br><span class="line">  <span class="function"><span class="type">void</span> <span class="title">enqueue</span><span class="params">(<span class="type">const</span> LiveInterval *LI)</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/// dequeue - Return the next unassigned register, or NULL.</span></span><br><span class="line">  <span 
class="function"><span class="keyword">virtual</span> <span class="type">const</span> LiveInterval *<span class="title">dequeue</span><span class="params">()</span> </span>= <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line">  <span class="function"><span class="keyword">virtual</span> MCRegister <span class="title">selectOrSplit</span><span class="params">(<span class="type">const</span> LiveInterval &amp;VirtReg,</span></span></span><br><span class="line"><span class="params"><span class="function">                                   SmallVectorImpl&lt;Register&gt; &amp;splitLVRs)</span> </span>= <span class="number">0</span>;</span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Simplified structure:</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">RegAllocBase::allocatePhysRegs</span><span class="params">()</span></span> &#123;</span><br><span class="line">    <span class="keyword">while</span> (<span class="type">const</span> LiveInterval *VirtReg = <span class="built_in">dequeue</span>()) &#123;</span><br><span class="line">        VirtRegVec SplitVRegs;</span><br><span class="line">        MCRegister AvailablePhysReg = <span class="built_in">selectOrSplit</span>(*VirtReg, SplitVRegs);</span><br><span class="line">        <span class="keyword">if</span> (AvailablePhysReg == ~<span class="number">0u</span>) &#123;</span><br><span class="line">            <span class="comment">// Report the allocation failure (error handling elided).</span></span><br><span class="line">        &#125; <span class="keyword">else</span> <span class="keyword">if</span> (AvailablePhysReg) &#123;</span><br><span class="line">            Matrix-&gt;<span class="built_in">assign</span>(*VirtReg, AvailablePhysReg);</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">for</span> (Register Reg : SplitVRegs) &#123;</span><br><span class="line"></span><br><span 
class="line">            LiveInterval *SplitVirtReg = &amp;LIS-&gt;<span class="built_in">getInterval</span>(Reg);</span><br><span class="line"></span><br><span class="line">            <span class="built_in">enqueue</span>(SplitVirtReg);</span><br><span class="line">            ++NumNewQueued;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><h2 id="RegAllocBasic"><a href="#RegAllocBasic" class="headerlink" title="RegAllocBasic"></a>RegAllocBasic</h2><p>llvm&#x2F;lib&#x2F;CodeGen&#x2F;RegAllocBasic.cpp<br>A very simple implementation. As before, start with the implementation of selectOrSplit:</p><ol><li>First check whether a physical register is available; if so, return it directly.</li><li>If not, try to spill already-assigned virtual registers that have a lower spill weight.</li><li>If that also fails, spill the current virtual register itself.</li></ol><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span 
class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">enqueueImpl</span><span class="params">(<span class="type">const</span> LiveInterval *LI)</span> <span class="keyword">override</span> </span>&#123; Queue.<span class="built_in">push</span>(LI); &#125;</span><br><span class="line"><span class="function"><span class="type">const</span> LiveInterval *<span class="title">dequeue</span><span class="params">()</span> <span class="keyword">override</span> </span>&#123;</span><br><span class="line">  <span class="keyword">if</span> (Queue.<span class="built_in">empty</span>())</span><br><span class="line">    <span class="keyword">return</span> <span class="literal">nullptr</span>;</span><br><span class="line">  <span class="type">const</span> LiveInterval *LI = Queue.<span class="built_in">top</span>();</span><br><span class="line">  Queue.<span class="built_in">pop</span>();</span><br><span class="line">  <span class="keyword">return</span> LI;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function">MCRegister <span 
class="title">RABasic::selectOrSplit</span><span class="params">(<span class="type">const</span> LiveInterval &amp;VirtReg,</span></span></span><br><span class="line"><span class="params"><span class="function">                                  SmallVectorImpl&lt;Register&gt; &amp;SplitVRegs)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// Populate a list of physical register spill candidates.</span></span><br><span class="line">  SmallVector&lt;MCRegister, <span class="number">8</span>&gt; PhysRegSpillCands;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Check for an available register in this class.</span></span><br><span class="line">  <span class="keyword">auto</span> Order =</span><br><span class="line">      AllocationOrder::<span class="built_in">create</span>(VirtReg.<span class="built_in">reg</span>(), *VRM, RegClassInfo, Matrix);</span><br><span class="line">  <span class="keyword">for</span> (MCRegister PhysReg : Order) &#123;</span><br><span class="line">    <span class="built_in">assert</span>(PhysReg.<span class="built_in">isValid</span>());</span><br><span class="line">    <span class="comment">// Check for interference in PhysReg</span></span><br><span class="line">    <span class="keyword">switch</span> (Matrix-&gt;<span class="built_in">checkInterference</span>(VirtReg, PhysReg)) &#123;</span><br><span class="line">    <span class="keyword">case</span> LiveRegMatrix::IK_Free:</span><br><span class="line">      <span class="comment">// PhysReg is available, allocate it.</span></span><br><span class="line">      <span class="keyword">return</span> PhysReg;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">case</span> LiveRegMatrix::IK_VirtReg:</span><br><span class="line">      <span class="comment">// Only virtual registers in the way, we may be able to spill them.</span></span><br><span class="line">      PhysRegSpillCands.<span 
class="built_in">push_back</span>(PhysReg);</span><br><span class="line">      <span class="keyword">continue</span>;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">default</span>:</span><br><span class="line">      <span class="comment">// RegMask or RegUnit interference.</span></span><br><span class="line">      <span class="keyword">continue</span>;</span><br><span class="line">    &#125;</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Try to spill another interfering reg with less spill weight.</span></span><br><span class="line">  <span class="keyword">for</span> (MCRegister &amp;PhysReg : PhysRegSpillCands) &#123;</span><br><span class="line">    <span class="keyword">if</span> (!<span class="built_in">spillInterferences</span>(VirtReg, PhysReg, SplitVRegs))</span><br><span class="line">      <span class="keyword">continue</span>;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">assert</span>(!Matrix-&gt;<span class="built_in">checkInterference</span>(VirtReg, PhysReg) &amp;&amp;</span><br><span class="line">           <span class="string">&quot;Interference after spill.&quot;</span>);</span><br><span class="line">    <span class="comment">// Tell the caller to allocate to this newly freed physical register.</span></span><br><span class="line">    <span class="keyword">return</span> PhysReg;</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// No other spill candidates were found, so spill the current VirtReg.</span></span><br><span class="line">  <span class="built_in">LLVM_DEBUG</span>(<span class="built_in">dbgs</span>() &lt;&lt; <span class="string">&quot;spilling: &quot;</span> &lt;&lt; VirtReg &lt;&lt; <span class="string">&#x27;\n&#x27;</span>);</span><br><span class="line">  <span class="keyword">if</span> (!VirtReg.<span 
class="built_in">isSpillable</span>())</span><br><span class="line">    <span class="keyword">return</span> ~<span class="number">0u</span>;</span><br><span class="line">  <span class="function">LiveRangeEdit <span class="title">LRE</span><span class="params">(&amp;VirtReg, SplitVRegs, *MF, *LIS, VRM, <span class="keyword">this</span>, &amp;DeadRemats)</span></span>;</span><br><span class="line">  <span class="built_in">spiller</span>().<span class="built_in">spill</span>(LRE);</span><br><span class="line"></span><br><span class="line">  <span class="comment">// The live virtual register requesting allocation was spilled, so tell</span></span><br><span class="line">  <span class="comment">// the caller not to allocate anything during this round.</span></span><br><span class="line">  <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure>]]>
    </content>
    <id>http://example.com/2026/03/22/llvm/regalloc_llvm1/</id>
    <link href="http://example.com/2026/03/22/llvm/regalloc_llvm1/"/>
    <published>2026-03-22T00:00:00.000Z</published>
    <summary>
      <![CDATA[<p>Continuing from the previous post.<br>The data structures RABasic relies on:</p>
<ul>
<li><p>LiveRegMatrix, LiveIntervals, VirtRegMap, LiveIntervalUnion</p>
</li>
<li><p>LiveIntervals</p>]]>
    </summary>
    <title>LLVM Register Allocation 1</title>
    <updated>2026-04-28T13:16:41.648Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
      <![CDATA[<h1 id="依赖的分析"><a href="#依赖的分析" class="headerlink" title="Required analyses"></a>Required analyses</h1><ul><li>value number</li><li>live range (liveness). This part is fairly complex.</li><li>machine dom tree&#x2F;machine loop</li><li>block frequency</li></ul><h2 id="value-number"><a href="#value-number" class="headerlink" title="value number"></a>value number</h2><blockquote><p>VNInfo can handle MIR in both SSA and non-SSA form.</p></blockquote><p>An example:</p><figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line">; llc %s -march=aarch64 --stop-after=register-coalescer </span><br><span class="line"></span><br><span class="line">declare void @print(i32);</span><br><span class="line"></span><br><span class="line">define i32 @foo(i32 %a, i1 %c) &#123;</span><br><span class="line"></span><br><span class="line">bb0:</span><br><span class="line">    br i1 %c, label %bb1, label %bb2</span><br><span class="line">bb1:</span><br><span class="line">    call void @print(i32 %a)</span><br><span class="line">    br label %bb3</span><br><span class="line"></span><br><span class="line">bb2:</span><br><span class="line">    %a2 = add i32 %a, 1</span><br><span class="line">    br label %bb3</span><br><span class="line">bb3:</span><br><span class="line">    %a3 = phi i32 [%a, %bb1], [%a2, %bb2]</span><br><span class="line">    ret i32 %a3</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line">; llc -march=aarch64 --start-after=register-coalescer -debug-only=regalloc -O1 %s</span><br><span class="line">bb.0.bb0:</span><br><span class="line">    successors: %bb.1(0x40000000), %bb.2(0x40000000)</span><br><span class="line">    liveins: $w0, $w1</span><br><span class="line">  </span><br><span class="line">    %3:gpr32 = COPY $w1</span><br><span class="line">    %5:gpr32common = COPY $w0</span><br><span class="line">    TBZW %3, 0, %bb.2</span><br><span class="line">    B %bb.1</span><br><span class="line">  </span><br><span class="line">  bb.2.bb2:</span><br><span class="line">    successors: %bb.3(0x80000000)</span><br><span class="line">  </span><br><span class="line">    %5:gpr32common = ADDWri %5, 1, 0</span><br><span class="line">    B %bb.3</span><br><span class="line"></span><br><span class="line">  bb.1.bb1:</span><br><span class="line">    successors: 
%bb.3(0x80000000)</span><br><span class="line">  </span><br><span class="line">    ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp</span><br><span class="line">    $w0 = COPY %5</span><br><span class="line">    BL @print, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp</span><br><span class="line">    ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp</span><br><span class="line">    B %bb.3</span><br><span class="line"></span><br><span class="line">  bb.3.bb3:</span><br><span class="line">    $w0 = COPY %5</span><br><span class="line">    RET_ReallyLR implicit $w0</span><br><span class="line">...</span><br></pre></td></tr></table></figure><p>The debug output shows the live interval information:</p><figure class="highlight txt"><table><tr><td 
class="code"><pre><span class="line">W0 [0B,32r:0)[160r,176r:2)[240r,256r:1) 0@0B-phi 1@240r 2@160r</span><br><span class="line">W1 [0B,16r:0) 0@0B-phi</span><br><span class="line">%3 [16r,48r:0) 0@16r  weight:0.000000e+00</span><br><span class="line">%5 [32r,96r:1)[96r,128B:0)[128B,224B:1)[224B,240r:2) 0@96r 1@32r 2@224B-phi  weight:0.000000e+00</span><br><span class="line">Function Live Ins: $w0 in %2, $w1 in %3</span><br><span class="line"></span><br><span class="line">0Bbb.0.bb0:</span><br><span class="line">  successors: %bb.2(0x40000000), %bb.1(0x40000000); %bb.2(50.00%), %bb.1(50.00%)</span><br><span class="line">  liveins: $w0, $w1</span><br><span class="line">16B  %3:gpr32 = COPY $w1</span><br><span class="line">32B  %5:gpr32common = COPY $w0</span><br><span class="line">48B  TBZW %3:gpr32, 0, %bb.1</span><br><span class="line">64B  B %bb.2</span><br><span class="line"></span><br><span class="line">80Bbb.1.bb2:</span><br><span class="line">; predecessors: %bb.0</span><br><span class="line">  successors: %bb.3(0x80000000); %bb.3(100.00%)</span><br><span class="line"></span><br><span class="line">96B  %5:gpr32common = ADDWri %5:gpr32common, 1, 0</span><br><span class="line">112B  B %bb.3</span><br><span class="line"></span><br><span class="line">128Bbb.2.bb1:</span><br><span class="line">; predecessors: %bb.0</span><br><span class="line">  successors: %bb.3(0x80000000); %bb.3(100.00%)</span><br><span class="line"></span><br><span class="line">144B  ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp</span><br><span class="line">160B  $w0 = COPY %5:gpr32common</span><br><span class="line">176B  BL @print, &lt;regmask $fp $lr $wzr $wzr_hi $xzr $b8 $b9 $b10 $b11 $b12 $b13 $b14 $b15 $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $h8 $h9 $h10 $h11 $h12 $h13 $h14 $h15 $s8 $s9 $s10 $s11 and 92 more...&gt;, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp</span><br><span class="line">192B  ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit 
$sp</span><br><span class="line">208B  B %bb.3</span><br><span class="line"></span><br><span class="line">224Bbb.3.bb3:</span><br><span class="line">; predecessors: %bb.1, %bb.2</span><br><span class="line"></span><br><span class="line">240B  $w0 = COPY %5:gpr32common</span><br><span class="line">256B  RET_ReallyLR implicit $w0</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>Note the interval <code>%5 [32r,96r:1)[96r,128B:0)[128B,224B:1)[224B,240r:2) 0@96r 1@32r 2@224B-phi</code>: it is the live range taken over by the phi, with value 1 defined at <code>32r</code>, value 0 defined at <code>96r</code>, and value 2 being the phi-def at <code>224B</code> where the two merge.<br>In this way, live intervals encode fine-grained reaching-definition and liveness information, which is the basis for the register allocation that follows.</p><hr><p>Example:</p><figure class="highlight c"><table><tr><td class="code"><pre><span class="line"><span class="comment">// clang a.c -O1 -mllvm -debug-only=regalloc</span></span><br><span class="line"><span class="comment">// x64</span></span><br><span class="line"><span class="keyword">extern</span> <span class="type">int</span> <span class="title function_">foo</span><span class="params">(<span class="type">int</span>)</span>;</span><br><span class="line"></span><br><span class="line"><span class="type">int</span> <span class="title function_">bar</span><span class="params">(<span class="type">int</span> a, <span class="type">int</span> b)</span> &#123;</span><br><span class="line">    <span class="keyword">if</span>(a&gt;b) &#123;</span><br><span class="line">       a = foo(a) +  a*b;</span><br><span class="line">         </span><br><span class="line">    &#125;<span class="keyword">else</span>&#123;</span><br><span 
class="line">        a = foo(a)*foo(a) + b;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> a*(b-a);</span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>Output:</p><figure class="highlight txt"><table><tr><td class="code"><pre><span class="line"></span><br><span class="line">Computing live-in reg-units in ABI blocks.</span><br><span class="line">0B%bb.0 DIL#0 DIH#0 HDI#0 SIL#0 SIH#0 HSI#0</span><br><span class="line">Created 6 new intervals.</span><br><span class="line">********** INTERVALS **********</span><br><span class="line">DIL [0B,32r:0)[64r,80r:2)[320r,336r:1) 0@0B-phi 1@320r 2@64r</span><br><span class="line">DIH [0B,32r:0)[64r,80r:2)[320r,336r:1) 0@0B-phi 1@320r 2@64r</span><br><span class="line">HDI [0B,32r:0)[64r,80r:2)[320r,336r:1) 0@0B-phi 1@320r 2@64r</span><br><span class="line">SIL [0B,16r:0) 0@0B-phi</span><br><span 
class="line">SIH [0B,16r:0) 0@0B-phi</span><br><span class="line">HSI [0B,16r:0) 0@0B-phi</span><br><span class="line">%1 [224r,240r:0)[240r,256r:1) 0@224r 1@240r  weight:0.000000e+00</span><br><span class="line">%2 [416r,432r:0)[432r,448r:1) 0@416r 1@432r  weight:0.000000e+00</span><br><span class="line">%3 [480r,544r:0) 0@480r  weight:0.000000e+00</span><br><span class="line">%4 [32r,192r:0)[288B,320r:0) 0@32r  weight:0.000000e+00</span><br><span class="line">%5 [16r,496r:0) 0@16r  weight:0.000000e+00</span><br><span class="line">%6 [112r,224r:0)[288B,400r:0) 0@112r  weight:0.000000e+00</span><br><span class="line">%8 [368r,384r:0) 0@368r  weight:0.000000e+00</span><br><span class="line">%9 [384r,400r:0)[400r,416r:1) 0@384r 1@400r  weight:0.000000e+00</span><br><span class="line">%10 [192r,208r:0)[208r,240r:1) 0@192r 1@208r  weight:0.000000e+00</span><br><span class="line">%11 [496r,512r:0)[512r,528r:1) 0@496r 1@512r  weight:0.000000e+00</span><br><span class="line">%12 [528r,544r:0)[544r,560r:1) 0@528r 1@544r  weight:0.000000e+00</span><br><span class="line">%13 [256r,288B:1)[448r,464B:0)[464B,480r:2) 0@448r 1@256r 2@464B-phi  weight:0.000000e+00</span><br><span class="line">RegMasks: 80r 336r</span><br><span class="line">********** MACHINEINSTRS **********</span><br><span class="line"># Machine code for function bar: NoPHIs, TracksLiveness, TiedOpsRewritten</span><br><span class="line">Function Live Ins: $edi in %4, $esi in %5</span><br><span class="line"></span><br><span class="line">0Bbb.0.entry:</span><br><span class="line">  successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)</span><br><span class="line">  liveins: $edi, $esi</span><br><span class="line">  DBG_VALUE $edi, $noreg, !&quot;a&quot;, !DIExpression(), debug-location !22; example.c:0 line no:50</span><br><span class="line">  DBG_VALUE $esi, $noreg, !&quot;b&quot;, !DIExpression(), debug-location !22; example.c:0 line no:50</span><br><span class="line">16B  %5:gr32 = COPY 
$esi</span><br><span class="line">32B  %4:gr32 = COPY $edi</span><br><span class="line">48B  ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !25; example.c:0</span><br><span class="line">64B  $edi = COPY %4:gr32, debug-location !25; example.c:0</span><br><span class="line">80B  CALL64pcrel32 target-flags(x86-plt) @foo, &lt;regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...&gt;, implicit $rsp, implicit $ssp, implicit $edi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax, debug-location !25; example.c:0</span><br><span class="line">96B  ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !25; example.c:0</span><br><span class="line">112B  %6:gr32 = COPY killed $eax, debug-location !25; example.c:0</span><br><span class="line">128B  CMP32rr %4:gr32, %5:gr32, implicit-def $eflags, debug-location !23; example.c:51:9</span><br><span class="line">144B  JCC_1 %bb.2, 14, implicit killed $eflags, debug-location !26; example.c:51:9</span><br><span class="line">160B  JMP_1 %bb.1, debug-location !26; example.c:51:9</span><br><span class="line"></span><br><span class="line">176Bbb.1.if.then:</span><br><span class="line">; predecessors: %bb.0</span><br><span class="line">  successors: %bb.3(0x80000000); %bb.3(100.00%)</span><br><span class="line"></span><br><span class="line">192B  %10:gr32 = COPY %4:gr32, debug-location !27; example.c:52:23</span><br><span class="line">208B  %10:gr32 = nsw IMUL32rr %10:gr32(tied-def 0), %5:gr32, implicit-def dead $eflags, debug-location !27; example.c:52:23</span><br><span class="line">224B  %1:gr32 = COPY %6:gr32, debug-location !29; example.c:52:19</span><br><span class="line">240B  %1:gr32 = nsw 
ADD32rr %1:gr32(tied-def 0), %10:gr32, implicit-def dead $eflags, debug-instr-number 2, debug-location !29; example.c:52:19</span><br><span class="line">  DBG_INSTR_REF !&quot;a&quot;, !DIExpression(DW_OP_LLVM_arg, 0), dbg-instr-ref(2, 0), debug-location !22; example.c:0 line no:50</span><br><span class="line">256B  %13:gr32 = COPY %1:gr32</span><br><span class="line">272B  JMP_1 %bb.3, debug-location !30; example.c:54:5</span><br><span class="line"></span><br><span class="line">288Bbb.2.if.else:</span><br><span class="line">; predecessors: %bb.0</span><br><span class="line">  successors: %bb.3(0x80000000); %bb.3(100.00%)</span><br><span class="line"></span><br><span class="line">304B  ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !31; example.c:55:20</span><br><span class="line">320B  $edi = COPY %4:gr32, debug-location !31; example.c:55:20</span><br><span class="line">336B  CALL64pcrel32 target-flags(x86-plt) @foo, &lt;regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...&gt;, implicit $rsp, implicit $ssp, implicit $edi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax, debug-location !31; example.c:55:20</span><br><span class="line">352B  ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !31; example.c:55:20</span><br><span class="line">368B  %8:gr32 = COPY killed $eax, debug-location !31; example.c:55:20</span><br><span class="line">384B  %9:gr32 = COPY %8:gr32, debug-location !33; example.c:55:19</span><br><span class="line">400B  %9:gr32 = nsw IMUL32rr %9:gr32(tied-def 0), %6:gr32, implicit-def dead $eflags, debug-location !33; example.c:55:19</span><br><span class="line">416B  %2:gr32 = COPY %9:gr32, 
debug-location !34; example.c:55:27</span><br><span class="line">432B  %2:gr32 = nsw ADD32rr %2:gr32(tied-def 0), %5:gr32, implicit-def dead $eflags, debug-instr-number 1, debug-location !34; example.c:55:27</span><br><span class="line">  DBG_INSTR_REF !&quot;a&quot;, !DIExpression(DW_OP_LLVM_arg, 0), dbg-instr-ref(1, 0), debug-location !22; example.c:0 line no:50</span><br><span class="line">448B  %13:gr32 = COPY %2:gr32</span><br><span class="line"></span><br><span class="line">464Bbb.3.if.end:</span><br><span class="line">; predecessors: %bb.2, %bb.1</span><br><span class="line"></span><br><span class="line">480B  %3:gr32 = COPY %13:gr32, debug-location !25; example.c:0</span><br><span class="line">  DBG_INSTR_REF !&quot;a&quot;, !DIExpression(DW_OP_LLVM_arg, 0), dbg-instr-ref(3, 0), debug-location !22; example.c:0 line no:50</span><br><span class="line">496B  %11:gr32 = COPY %5:gr32, debug-location !35; example.c:57:16</span><br><span class="line">512B  %11:gr32 = nsw SUB32rr %11:gr32(tied-def 0), %3:gr32, implicit-def dead $eflags, debug-location !35; example.c:57:16</span><br><span class="line">528B  %12:gr32 = COPY %11:gr32, debug-location !36; example.c:57:13</span><br><span class="line">544B  %12:gr32 = nsw IMUL32rr %12:gr32(tied-def 0), %3:gr32, implicit-def dead $eflags, debug-location !36; example.c:57:13</span><br><span class="line">560B  $eax = COPY %12:gr32, debug-location !37; example.c:57:5</span><br><span class="line">576B  RET 0, killed $eax, debug-location !37; example.c:57:5</span><br><span class="line"></span><br><span class="line"># End machine code for function bar.</span><br><span class="line"></span><br><span class="line">********** REGISTER COALESCER **********</span><br><span class="line">********** Function: bar</span><br><span class="line">********** JOINING INTERVALS ***********</span><br><span class="line">entry:</span><br><span class="line">16B%5:gr32 = COPY $esi</span><br><span class="line">Considering merging %5 with 
$esi</span><br><span class="line">Can only merge into reserved registers.</span><br><span class="line">32B%4:gr32 = COPY $edi</span><br><span class="line">Considering merging %4 with $edi</span><br><span class="line">Can only merge into reserved registers.</span><br><span class="line">64B$edi = COPY %4:gr32, debug-location !25; example.c:0</span><br><span class="line">Considering merging %4 with $edi</span><br><span class="line">Can only merge into reserved registers.</span><br><span class="line">112B%6:gr32 = COPY killed $eax, debug-location !25; example.c:0</span><br><span class="line">Considering merging %6 with $eax</span><br><span class="line">Can only merge into reserved registers.</span><br><span class="line">if.then:</span><br><span class="line">if.else:</span><br><span class="line">320B$edi = COPY %4:gr32, debug-location !31; example.c:55:20</span><br><span class="line">Considering merging %4 with $edi</span><br><span class="line">Can only merge into reserved registers.</span><br><span class="line">368B%8:gr32 = COPY killed $eax, debug-location !31; example.c:55:20</span><br><span class="line">Considering merging %8 with $eax</span><br><span class="line">Can only merge into reserved registers.</span><br><span class="line">if.end:</span><br><span class="line">560B$eax = COPY %12:gr32, debug-location !37; example.c:57:5</span><br><span class="line">Considering merging %12 with $eax</span><br><span class="line">Can only merge into reserved registers.</span><br><span class="line">192B%10:gr32 = COPY %4:gr32, debug-location !27; example.c:52:23</span><br><span class="line">AllocationOrder(GR32) = [ $eax $ecx $edx $esi $edi $r8d $r9d $r10d $r11d $ebx $ebp $r14d $r15d $r12d $r13d ]</span><br><span class="line">Considering merging to GR32 with %4 in %10</span><br><span class="line">RHS = %4 [32r,192r:0)[288B,320r:0) 0@32r  weight:0.000000e+00</span><br><span class="line">LHS = %10 [192r,208r:0)[208r,240r:1) 0@192r 1@208r  weight:0.000000e+00</span><br><span 
class="line">merge %10:0@192r into %4:0@32r --&gt; @32r</span><br><span class="line">erased:192r%10:gr32 = COPY %4:gr32, debug-location !27; example.c:52:23</span><br><span class="line">updated: 32B%10:gr32 = COPY $edi</span><br><span class="line">updated: 64B$edi = COPY %10:gr32, debug-location !25; example.c:0</span><br><span class="line">updated: 128BCMP32rr %10:gr32, %5:gr32, implicit-def $eflags, debug-location !23; example.c:51:9</span><br><span class="line">updated: 320B$edi = COPY %10:gr32, debug-location !31; example.c:55:20</span><br><span class="line">Success: %4 -&gt; %10</span><br><span class="line">Result = %10 [32r,208r:0)[208r,240r:1)[288B,320r:0) 0@32r 1@208r  weight:0.000000e+00</span><br><span class="line">224B%1:gr32 = COPY %6:gr32, debug-location !29; example.c:52:19</span><br><span class="line">Considering merging to GR32 with %6 in %1</span><br><span class="line">RHS = %6 [112r,224r:0)[288B,400r:0) 0@112r  weight:0.000000e+00</span><br><span class="line">LHS = %1 [224r,240r:0)[240r,256r:1) 0@224r 1@240r  weight:0.000000e+00</span><br><span class="line">merge %1:0@224r into %6:0@112r --&gt; @112r</span><br><span class="line">erased:224r%1:gr32 = COPY %6:gr32, debug-location !29; example.c:52:19</span><br><span class="line">updated: 112B%1:gr32 = COPY killed $eax, debug-location !25; example.c:0</span><br><span class="line">updated: 400B%9:gr32 = nsw IMUL32rr %9:gr32(tied-def 0), %1:gr32, implicit-def dead $eflags, debug-location !33; example.c:55:19</span><br><span class="line">Success: %6 -&gt; %1</span><br><span class="line">Result = %1 [112r,240r:0)[240r,256r:1)[288B,400r:0) 0@112r 1@240r  weight:0.000000e+00</span><br><span class="line">256B%13:gr32 = COPY %1:gr32</span><br><span class="line">Considering merging to GR32 with %1 in %13</span><br><span class="line">RHS = %1 [112r,240r:0)[240r,256r:1)[288B,400r:0) 0@112r 1@240r  weight:0.000000e+00</span><br><span class="line">LHS = %13 [256r,288B:1)[448r,464B:0)[464B,480r:2) 0@448r 1@256r 
2@464B-phi  weight:0.000000e+00</span><br><span class="line">merge %13:1@256r into %1:1@240r --&gt; @240r</span><br><span class="line">erased:256r%13:gr32 = COPY %1:gr32</span><br><span class="line">updated: 112B%13:gr32 = COPY killed $eax, debug-location !25; example.c:0</span><br><span class="line">updated: 240B%13:gr32 = nsw ADD32rr %13:gr32(tied-def 0), %10:gr32, implicit-def dead $eflags, debug-instr-number 2, debug-location !29; example.c:52:19</span><br><span class="line">updated: 400B%9:gr32 = nsw IMUL32rr %9:gr32(tied-def 0), %13:gr32, implicit-def dead $eflags, debug-location !33; example.c:55:19</span><br><span class="line">Success: %1 -&gt; %13</span><br><span class="line">Result = %13 [112r,240r:3)[240r,288B:1)[288B,400r:3)[448r,464B:0)[464B,480r:2) 0@448r 1@240r 2@464B-phi 3@112r  weight:0.000000e+00</span><br><span class="line">384B%9:gr32 = COPY %8:gr32, debug-location !33; example.c:55:19</span><br><span class="line">Considering merging to GR32 with %8 in %9</span><br><span class="line">RHS = %8 [368r,384r:0) 0@368r  weight:0.000000e+00</span><br><span class="line">LHS = %9 [384r,400r:0)[400r,416r:1) 0@384r 1@400r  weight:0.000000e+00</span><br><span class="line">merge %9:0@384r into %8:0@368r --&gt; @368r</span><br><span class="line">erased:384r%9:gr32 = COPY %8:gr32, debug-location !33; example.c:55:19</span><br><span class="line">updated: 368B%9:gr32 = COPY killed $eax, debug-location !31; example.c:55:20</span><br><span class="line">Success: %8 -&gt; %9</span><br><span class="line">Result = %9 [368r,400r:0)[400r,416r:1) 0@368r 1@400r  weight:0.000000e+00</span><br><span class="line">416B%2:gr32 = COPY %9:gr32, debug-location !34; example.c:55:27</span><br><span class="line">Considering merging to GR32 with %9 in %2</span><br><span class="line">RHS = %9 [368r,400r:0)[400r,416r:1) 0@368r 1@400r  weight:0.000000e+00</span><br><span class="line">LHS = %2 [416r,432r:0)[432r,448r:1) 0@416r 1@432r  weight:0.000000e+00</span><br><span 
class="line">merge %2:0@416r into %9:1@400r --&gt; @400r</span><br><span class="line">erased:416r%2:gr32 = COPY %9:gr32, debug-location !34; example.c:55:27</span><br><span class="line">updated: 368B%2:gr32 = COPY killed $eax, debug-location !31; example.c:55:20</span><br><span class="line">updated: 400B%2:gr32 = nsw IMUL32rr %2:gr32(tied-def 0), %13:gr32, implicit-def dead $eflags, debug-location !33; example.c:55:19</span><br><span class="line">Success: %9 -&gt; %2</span><br><span class="line">Result = %2 [368r,400r:2)[400r,432r:0)[432r,448r:1) 0@400r 1@432r 2@368r  weight:0.000000e+00</span><br><span class="line">448B%13:gr32 = COPY %2:gr32</span><br><span class="line">Considering merging to GR32 with %2 in %13</span><br><span class="line">RHS = %2 [368r,400r:2)[400r,432r:0)[432r,448r:1) 0@400r 1@432r 2@368r  weight:0.000000e+00</span><br><span class="line">LHS = %13 [112r,240r:3)[240r,288B:1)[288B,400r:3)[448r,464B:0)[464B,480r:2) 0@448r 1@240r 2@464B-phi 3@112r  weight:0.000000e+00</span><br><span class="line">merge %13:0@448r into %2:1@432r --&gt; @432r</span><br><span class="line">interference at %2:2@368r</span><br><span class="line">Interference!</span><br><span class="line">480B%3:gr32 = COPY %13:gr32, debug-location !25; example.c:0</span><br><span class="line">Considering merging to GR32 with %3 in %13</span><br><span class="line">RHS = %3 [480r,544r:0) 0@480r  weight:0.000000e+00</span><br><span class="line">LHS = %13 [112r,240r:3)[240r,288B:1)[288B,400r:3)[448r,464B:0)[464B,480r:2) 0@448r 1@240r 2@464B-phi 3@112r  weight:0.000000e+00</span><br><span class="line">merge %3:0@480r into %13:2@464B --&gt; @464B</span><br><span class="line">erased:480r%3:gr32 = COPY %13:gr32, debug-location !25; example.c:0</span><br><span class="line">updated: 512B%11:gr32 = nsw SUB32rr %11:gr32(tied-def 0), %13:gr32, implicit-def dead $eflags, debug-location !35; example.c:57:16</span><br><span class="line">updated: 544B%12:gr32 = nsw IMUL32rr %12:gr32(tied-def 0), 
%13:gr32, implicit-def dead $eflags, debug-location !36; example.c:57:13</span><br><span class="line">Success: %3 -&gt; %13</span><br><span class="line">Result = %13 [112r,240r:3)[240r,288B:1)[288B,400r:3)[448r,464B:0)[464B,544r:2) 0@448r 1@240r 2@464B-phi 3@112r  weight:0.000000e+00</span><br><span class="line">496B%11:gr32 = COPY %5:gr32, debug-location !35; example.c:57:16</span><br><span class="line">Considering merging to GR32 with %5 in %11</span><br><span class="line">RHS = %5 [16r,496r:0) 0@16r  weight:0.000000e+00</span><br><span class="line">LHS = %11 [496r,512r:0)[512r,528r:1) 0@496r 1@512r  weight:0.000000e+00</span><br><span class="line">merge %11:0@496r into %5:0@16r --&gt; @16r</span><br><span class="line">erased:496r%11:gr32 = COPY %5:gr32, debug-location !35; example.c:57:16</span><br><span class="line">updated: 16B%11:gr32 = COPY $esi</span><br><span class="line">updated: 128BCMP32rr %10:gr32, %11:gr32, implicit-def $eflags, debug-location !23; example.c:51:9</span><br><span class="line">updated: 432B%2:gr32 = nsw ADD32rr %2:gr32(tied-def 0), %11:gr32, implicit-def dead $eflags, debug-instr-number 1, debug-location !34; example.c:55:27</span><br><span class="line">updated: 208B%10:gr32 = nsw IMUL32rr %10:gr32(tied-def 0), %11:gr32, implicit-def dead $eflags, debug-location !27; example.c:52:23</span><br><span class="line">Success: %5 -&gt; %11</span><br><span class="line">Result = %11 [16r,512r:0)[512r,528r:1) 0@16r 1@512r  weight:0.000000e+00</span><br><span class="line">528B%12:gr32 = COPY %11:gr32, debug-location !36; example.c:57:13</span><br><span class="line">Considering merging to GR32 with %11 in %12</span><br><span class="line">RHS = %11 [16r,512r:0)[512r,528r:1) 0@16r 1@512r  weight:0.000000e+00</span><br><span class="line">LHS = %12 [528r,544r:0)[544r,560r:1) 0@528r 1@544r  weight:0.000000e+00</span><br><span class="line">merge %12:0@528r into %11:1@512r --&gt; @512r</span><br><span class="line">erased:528r%12:gr32 = COPY %11:gr32, 
debug-location !36; example.c:57:13</span><br><span class="line">updated: 16B%12:gr32 = COPY $esi</span><br><span class="line">updated: 512B%12:gr32 = nsw SUB32rr %12:gr32(tied-def 0), %13:gr32, implicit-def dead $eflags, debug-location !35; example.c:57:16</span><br><span class="line">updated: 128BCMP32rr %10:gr32, %12:gr32, implicit-def $eflags, debug-location !23; example.c:51:9</span><br><span class="line">updated: 432B%2:gr32 = nsw ADD32rr %2:gr32(tied-def 0), %12:gr32, implicit-def dead $eflags, debug-instr-number 1, debug-location !34; example.c:55:27</span><br><span class="line">updated: 208B%10:gr32 = nsw IMUL32rr %10:gr32(tied-def 0), %12:gr32, implicit-def dead $eflags, debug-location !27; example.c:52:23</span><br><span class="line">Success: %11 -&gt; %12</span><br><span class="line">Result = %12 [16r,512r:2)[512r,544r:0)[544r,560r:1) 0@512r 1@544r 2@16r  weight:0.000000e+00</span><br><span class="line">64B$edi = COPY %10:gr32, debug-location !25; example.c:0</span><br><span class="line">Considering merging %10 with $edi</span><br><span class="line">Can only merge into reserved registers.</span><br><span class="line">320B$edi = COPY %10:gr32, debug-location !31; example.c:55:20</span><br><span class="line">Considering merging %10 with $edi</span><br><span class="line">Can only merge into reserved registers.</span><br><span class="line">448B%13:gr32 = COPY %2:gr32</span><br><span class="line">Considering merging to GR32 with %2 in %13</span><br><span class="line">RHS = %2 [368r,400r:2)[400r,432r:0)[432r,448r:1) 0@400r 1@432r 2@368r  weight:0.000000e+00</span><br><span class="line">LHS = %13 [112r,240r:3)[240r,288B:1)[288B,400r:3)[448r,464B:0)[464B,544r:2) 0@448r 1@240r 2@464B-phi 3@112r  weight:0.000000e+00</span><br><span class="line">merge %13:0@448r into %2:1@432r --&gt; @432r</span><br><span class="line">interference at %2:2@368r</span><br><span class="line">Interference!</span><br><span class="line">Trying to inflate 0 regs.</span><br><span 
class="line">********** INTERVALS **********</span><br><span class="line">DIL [0B,32r:0)[64r,80r:2)[320r,336r:1) 0@0B-phi 1@320r 2@64r</span><br><span class="line">DIH [0B,32r:0)[64r,80r:2)[320r,336r:1) 0@0B-phi 1@320r 2@64r</span><br><span class="line">HDI [0B,32r:0)[64r,80r:2)[320r,336r:1) 0@0B-phi 1@320r 2@64r</span><br><span class="line">SIL [0B,16r:0) 0@0B-phi</span><br><span class="line">SIH [0B,16r:0) 0@0B-phi</span><br><span class="line">HSI [0B,16r:0) 0@0B-phi</span><br><span class="line">%2 [368r,400r:2)[400r,432r:0)[432r,448r:1) 0@400r 1@432r 2@368r  weight:0.000000e+00</span><br><span class="line">%10 [32r,208r:0)[208r,240r:1)[288B,320r:0) 0@32r 1@208r  weight:0.000000e+00</span><br><span class="line">%12 [16r,512r:2)[512r,544r:0)[544r,560r:1) 0@512r 1@544r 2@16r  weight:0.000000e+00</span><br><span class="line">%13 [112r,240r:3)[240r,288B:1)[288B,400r:3)[448r,464B:0)[464B,544r:2) 0@448r 1@240r 2@464B-phi 3@112r  weight:0.000000e+00</span><br><span class="line">RegMasks: 80r 336r</span><br><span class="line">********** MACHINEINSTRS **********</span><br><span class="line"># Machine code for function bar: NoPHIs, TracksLiveness, TiedOpsRewritten</span><br><span class="line">Function Live Ins: $edi in %4, $esi in %5</span><br><span class="line"></span><br><span class="line">0Bbb.0.entry:</span><br><span class="line">  successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)</span><br><span class="line">  liveins: $edi, $esi</span><br><span class="line">  DBG_VALUE $edi, $noreg, !&quot;a&quot;, !DIExpression(), debug-location !22; example.c:0 line no:50</span><br><span class="line">  DBG_VALUE $esi, $noreg, !&quot;b&quot;, !DIExpression(), debug-location !22; example.c:0 line no:50</span><br><span class="line">16B  %12:gr32 = COPY $esi</span><br><span class="line">32B  %10:gr32 = COPY $edi</span><br><span class="line">48B  ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit 
$rsp, implicit $ssp, debug-location !25; example.c:0</span><br><span class="line">64B  $edi = COPY %10:gr32, debug-location !25; example.c:0</span><br><span class="line">80B  CALL64pcrel32 target-flags(x86-plt) @foo, &lt;regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...&gt;, implicit $rsp, implicit $ssp, implicit $edi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax, debug-location !25; example.c:0</span><br><span class="line">96B  ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !25; example.c:0</span><br><span class="line">112B  %13:gr32 = COPY killed $eax, debug-location !25; example.c:0</span><br><span class="line">128B  CMP32rr %10:gr32, %12:gr32, implicit-def $eflags, debug-location !23; example.c:51:9</span><br><span class="line">144B  JCC_1 %bb.2, 14, implicit killed $eflags, debug-location !26; example.c:51:9</span><br><span class="line">160B  JMP_1 %bb.1, debug-location !26; example.c:51:9</span><br><span class="line"></span><br><span class="line">176Bbb.1.if.then:</span><br><span class="line">; predecessors: %bb.0</span><br><span class="line">  successors: %bb.3(0x80000000); %bb.3(100.00%)</span><br><span class="line"></span><br><span class="line">208B  %10:gr32 = nsw IMUL32rr %10:gr32(tied-def 0), %12:gr32, implicit-def dead $eflags, debug-location !27; example.c:52:23</span><br><span class="line">240B  %13:gr32 = nsw ADD32rr %13:gr32(tied-def 0), %10:gr32, implicit-def dead $eflags, debug-instr-number 2, debug-location !29; example.c:52:19</span><br><span class="line">  DBG_INSTR_REF !&quot;a&quot;, !DIExpression(DW_OP_LLVM_arg, 0), dbg-instr-ref(2, 0), debug-location !22; example.c:0 line no:50</span><br><span class="line">272B  JMP_1 %bb.3, debug-location !30; example.c:54:5</span><br><span 
class="line"></span><br><span class="line">288Bbb.2.if.else:</span><br><span class="line">; predecessors: %bb.0</span><br><span class="line">  successors: %bb.3(0x80000000); %bb.3(100.00%)</span><br><span class="line"></span><br><span class="line">304B  ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !31; example.c:55:20</span><br><span class="line">320B  $edi = COPY %10:gr32, debug-location !31; example.c:55:20</span><br><span class="line">336B  CALL64pcrel32 target-flags(x86-plt) @foo, &lt;regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...&gt;, implicit $rsp, implicit $ssp, implicit $edi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax, debug-location !31; example.c:55:20</span><br><span class="line">352B  ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !31; example.c:55:20</span><br><span class="line">368B  %2:gr32 = COPY killed $eax, debug-location !31; example.c:55:20</span><br><span class="line">400B  %2:gr32 = nsw IMUL32rr %2:gr32(tied-def 0), %13:gr32, implicit-def dead $eflags, debug-location !33; example.c:55:19</span><br><span class="line">432B  %2:gr32 = nsw ADD32rr %2:gr32(tied-def 0), %12:gr32, implicit-def dead $eflags, debug-instr-number 1, debug-location !34; example.c:55:27</span><br><span class="line">  DBG_INSTR_REF !&quot;a&quot;, !DIExpression(DW_OP_LLVM_arg, 0), dbg-instr-ref(1, 0), debug-location !22; example.c:0 line no:50</span><br><span class="line">448B  %13:gr32 = COPY %2:gr32</span><br><span class="line"></span><br><span class="line">464Bbb.3.if.end:</span><br><span class="line">; predecessors: %bb.2, %bb.1</span><br><span class="line"></span><br><span class="line">  DBG_INSTR_REF 
!&quot;a&quot;, !DIExpression(DW_OP_LLVM_arg, 0), dbg-instr-ref(3, 0), debug-location !22; example.c:0 line no:50</span><br><span class="line">512B  %12:gr32 = nsw SUB32rr %12:gr32(tied-def 0), %13:gr32, implicit-def dead $eflags, debug-location !35; example.c:57:16</span><br><span class="line">544B  %12:gr32 = nsw IMUL32rr %12:gr32(tied-def 0), %13:gr32, implicit-def dead $eflags, debug-location !36; example.c:57:13</span><br><span class="line">560B  $eax = COPY %12:gr32, debug-location !37; example.c:57:5</span><br><span class="line">576B  RET 0, killed $eax, debug-location !37; example.c:57:5</span><br><span class="line"></span><br><span class="line"># End machine code for function bar.</span><br><span class="line"></span><br><span class="line">AllocationOrder(GR64) = [ $rax $rcx $rdx $rsi $rdi $r8 $r9 $r10 $r11 $rbx $r14 $r15 $r12 $r13 $rbp ]</span><br><span class="line">********** GREEDY REGISTER ALLOCATION **********</span><br><span class="line">********** Function: bar</span><br><span class="line">********** GREEDY REGISTER ALLOCATION **********</span><br><span class="line">********** Function: bar</span><br><span class="line">********** INTERVALS **********</span><br><span class="line">DIL [0B,32r:0)[64r,80r:2)[320r,336r:1) 0@0B-phi 1@320r 2@64r</span><br><span class="line">DIH [0B,32r:0)[64r,80r:2)[320r,336r:1) 0@0B-phi 1@320r 2@64r</span><br><span class="line">HDI [0B,32r:0)[64r,80r:2)[320r,336r:1) 0@0B-phi 1@320r 2@64r</span><br><span class="line">SIL [0B,16r:0) 0@0B-phi</span><br><span class="line">SIH [0B,16r:0) 0@0B-phi</span><br><span class="line">HSI [0B,16r:0) 0@0B-phi</span><br><span class="line">%2 [368r,400r:2)[400r,432r:0)[432r,448r:1) 0@400r 1@432r 2@368r  weight:INF</span><br><span class="line">%10 [32r,208r:0)[208r,240r:1)[288B,320r:0) 0@32r 1@208r  weight:7.866044e-03</span><br><span class="line">%12 [16r,512r:2)[512r,544r:0)[544r,560r:1) 0@512r 1@544r 2@16r  weight:8.559322e-03</span><br><span class="line">%13 
[112r,240r:3)[240r,288B:1)[288B,400r:3)[448r,464B:0)[464B,544r:2) 0@448r 1@240r 2@464B-phi 3@112r  weight:6.441327e-03</span><br><span class="line">RegMasks: 80r 336r</span><br><span class="line">********** MACHINEINSTRS **********</span><br><span class="line"># Machine code for function bar: NoPHIs, TracksLiveness, TiedOpsRewritten, TracksDebugUserValues</span><br><span class="line">Function Live Ins: $edi in %4, $esi in %5</span><br><span class="line"></span><br><span class="line">0Bbb.0.entry:</span><br><span class="line">  successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)</span><br><span class="line">  liveins: $edi, $esi</span><br><span class="line">16B  %12:gr32 = COPY $esi</span><br><span class="line">32B  %10:gr32 = COPY $edi</span><br><span class="line">48B  ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !25; example.c:0</span><br><span class="line">64B  $edi = COPY %10:gr32, debug-location !25; example.c:0</span><br><span class="line">80B  CALL64pcrel32 target-flags(x86-plt) @foo, &lt;regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...&gt;, implicit $rsp, implicit $ssp, implicit $edi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax, debug-location !25; example.c:0</span><br><span class="line">96B  ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !25; example.c:0</span><br><span class="line">112B  %13:gr32 = COPY killed $eax, debug-location !25; example.c:0</span><br><span class="line">128B  CMP32rr %10:gr32, %12:gr32, implicit-def $eflags, debug-location !23; example.c:51:9</span><br><span class="line">144B  JCC_1 %bb.2, 14, implicit killed $eflags, debug-location !26; 
example.c:51:9</span><br><span class="line">160B  JMP_1 %bb.1, debug-location !26; example.c:51:9</span><br><span class="line"></span><br><span class="line">176Bbb.1.if.then:</span><br><span class="line">; predecessors: %bb.0</span><br><span class="line">  successors: %bb.3(0x80000000); %bb.3(100.00%)</span><br><span class="line"></span><br><span class="line">208B  %10:gr32 = nsw IMUL32rr %10:gr32(tied-def 0), %12:gr32, implicit-def dead $eflags, debug-location !27; example.c:52:23</span><br><span class="line">240B  %13:gr32 = nsw ADD32rr %13:gr32(tied-def 0), %10:gr32, implicit-def dead $eflags, debug-instr-number 2, debug-location !29; example.c:52:19</span><br><span class="line">272B  JMP_1 %bb.3, debug-location !30; example.c:54:5</span><br><span class="line"></span><br><span class="line">288Bbb.2.if.else:</span><br><span class="line">; predecessors: %bb.0</span><br><span class="line">  successors: %bb.3(0x80000000); %bb.3(100.00%)</span><br><span class="line"></span><br><span class="line">304B  ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !31; example.c:55:20</span><br><span class="line">320B  $edi = COPY %10:gr32, debug-location !31; example.c:55:20</span><br><span class="line">336B  CALL64pcrel32 target-flags(x86-plt) @foo, &lt;regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...&gt;, implicit $rsp, implicit $ssp, implicit $edi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax, debug-location !31; example.c:55:20</span><br><span class="line">352B  ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !31; example.c:55:20</span><br><span class="line">368B  %2:gr32 = COPY killed $eax, debug-location !31; 
example.c:55:20</span><br><span class="line">400B  %2:gr32 = nsw IMUL32rr %2:gr32(tied-def 0), %13:gr32, implicit-def dead $eflags, debug-location !33; example.c:55:19</span><br><span class="line">432B  %2:gr32 = nsw ADD32rr %2:gr32(tied-def 0), %12:gr32, implicit-def dead $eflags, debug-instr-number 1, debug-location !34; example.c:55:27</span><br><span class="line">448B  %13:gr32 = COPY %2:gr32</span><br><span class="line"></span><br><span class="line">464Bbb.3.if.end:</span><br><span class="line">; predecessors: %bb.2, %bb.1</span><br><span class="line"></span><br><span class="line">512B  %12:gr32 = nsw SUB32rr %12:gr32(tied-def 0), %13:gr32, implicit-def dead $eflags, debug-location !35; example.c:57:16</span><br><span class="line">544B  %12:gr32 = nsw IMUL32rr %12:gr32(tied-def 0), %13:gr32, implicit-def dead $eflags, debug-location !36; example.c:57:13</span><br><span class="line">560B  $eax = COPY %12:gr32, debug-location !37; example.c:57:5</span><br><span class="line">576B  RET 0, killed $eax, debug-location !37; example.c:57:5</span><br><span class="line"></span><br><span class="line"># End machine code for function bar.</span><br><span class="line"></span><br><span class="line">Enqueuing %2</span><br><span class="line">AllocationOrder(GR32) = [ $eax $ecx $edx $esi $edi $r8d $r9d $r10d $r11d $ebx $ebp $r14d $r15d $r12d $r13d ]</span><br><span class="line">Enqueuing %10</span><br><span class="line">Enqueuing %12</span><br><span class="line">Enqueuing %13</span><br><span class="line"></span><br><span class="line">selectOrSplit GR32:%12 [16r,512r:2)[512r,544r:0)[544r,560r:1) 0@512r 1@544r 2@16r  weight:8.559322e-03</span><br><span class="line">hints: $eax $esi</span><br><span class="line">missed hint $eax</span><br><span class="line">Analyze counted 7 instrs in 4 blocks, through 0 blocks.</span><br><span class="line">$eaxstatic = 1.5 worse than no bundles</span><br><span class="line">assigning %12 to $ebx: BH [16r,512r:2)[512r,544r:0)[544r,560r:1) 0@512r 
1@544r 2@16r BL [16r,512r:2)[512r,544r:0)[544r,560r:1) 0@512r 1@544r 2@16r HBX [16r,512r:2)[512r,544r:0)[544r,560r:1) 0@512r 1@544r 2@16r</span><br><span class="line"></span><br><span class="line">selectOrSplit GR32:%13 [112r,240r:3)[240r,288B:1)[288B,400r:3)[448r,464B:0)[464B,544r:2) 0@448r 1@240r 2@464B-phi 3@112r  weight:6.441327e-03</span><br><span class="line">hints: $eax</span><br><span class="line">missed hint $eax</span><br><span class="line">Analyze counted 6 instrs in 5 blocks, through 0 blocks.</span><br><span class="line">$eaxstatic = 0.5, v=0, total = 1.0 with bundles EB#1 EB#2.</span><br><span class="line">assigning %13 to $ebp: BPL [112r,240r:3)[240r,288B:1)[288B,400r:3)[448r,464B:0)[464B,544r:2) 0@448r 1@240r 2@464B-phi 3@112r BPH [112r,240r:3)[240r,288B:1)[288B,400r:3)[448r,464B:0)[464B,544r:2) 0@448r 1@240r 2@464B-phi 3@112r HBP [112r,240r:3)[240r,288B:1)[288B,400r:3)[448r,464B:0)[464B,544r:2) 0@448r 1@240r 2@464B-phi 3@112r</span><br><span class="line"></span><br><span class="line">selectOrSplit GR32:%10 [32r,208r:0)[208r,240r:1)[288B,320r:0) 0@32r 1@208r  weight:7.866044e-03</span><br><span class="line">hints: $edi</span><br><span class="line">missed hint $edi</span><br><span class="line">Analyze counted 6 instrs in 3 blocks, through 0 blocks.</span><br><span class="line">$edistatic = 1.0, v=0, total = 1.0 with bundles EB#1.</span><br><span class="line">Split for $edi in 1 bundles, intv 1.</span><br><span class="line">splitAroundRegion with 2 globals.</span><br><span class="line">%bb.0 [0B;176B), uses 32r-128r, reg-out 1, enter after 80d, defined in block, interference overlaps uses.</span><br><span class="line">    selectIntv 1 -&gt; 1</span><br><span class="line">    enterIntvAfter 80d: valno 0</span><br><span class="line">    useIntv [88r;176B): [88r;176B):1</span><br><span class="line">    enterIntvBefore 32r: not live</span><br><span class="line">    useIntv [32B;88r): [32B;88r):2 [88r;176B):1</span><br><span class="line">%bb.1 [176B;288B), 
uses 208r-240r, reg-in 1, leave before invalid, killed in block before interference.</span><br><span class="line">    selectIntv 2 -&gt; 1</span><br><span class="line">    useIntv [176B;240r): [32B;88r):2 [88r;240r):1</span><br><span class="line">%bb.2 [288B;464B), uses 320r-320r, reg-in 1, leave before 320r, killed in block before interference.</span><br><span class="line">    selectIntv 1 -&gt; 1</span><br><span class="line">    useIntv [288B;320r): [32B;88r):2 [88r;240r):1 [288B;320r):1</span><br><span class="line">Removing 0 back-copies.</span><br><span class="line">  blit [32r,208r:0): [32r;88r)=2(%16):0 [88r;208r)=1(%15):0</span><br><span class="line">  blit [208r,240r:1): [208r;240r)=1(%15):1</span><br><span class="line">  blit [288B,320r:0): [288B;320r)=1(%15):0</span><br><span class="line">  rewr %bb.032r:2%16:gr32 = COPY $edi</span><br><span class="line">  rewr %bb.1208r:1%15:gr32 = nsw IMUL32rr %10:gr32(tied-def 0), %12:gr32, implicit-def dead $eflags, debug-location !27; example.c:52:23</span><br><span class="line">  rewr %bb.1240B:1%13:gr32 = nsw ADD32rr %13:gr32(tied-def 0), %15:gr32, implicit-def dead $eflags, debug-instr-number 2, debug-location !29; example.c:52:19</span><br><span class="line">  rewr %bb.1208B:1%15:gr32 = nsw IMUL32rr %15:gr32(tied-def 0), %12:gr32, implicit-def dead $eflags, debug-location !27; example.c:52:23</span><br><span class="line">  rewr %bb.064B:2$edi = COPY %16:gr32, debug-location !25; example.c:0</span><br><span class="line">  rewr %bb.0128B:1CMP32rr %15:gr32, %12:gr32, implicit-def $eflags, debug-location !23; example.c:51:9</span><br><span class="line">  rewr %bb.2320B:1$edi = COPY %15:gr32, debug-location !31; example.c:55:20</span><br><span class="line">  rewr %bb.088B:2%15:gr32 = COPY %16:gr32</span><br><span class="line">Main interval covers the same 3 blocks as original.</span><br><span class="line">not queueing unused  %14 EMPTY  weight:INF</span><br><span class="line">queuing new interval: %15 
[88r,208r:0)[208r,240r:1)[288B,320r:0) 0@88r 1@208r  weight:6.894198e-03</span><br><span class="line">Enqueuing %15</span><br><span class="line">queuing new interval: %16 [32r,88r:0) 0@32r  weight:6.644737e-03</span><br><span class="line">Enqueuing %16</span><br><span class="line"></span><br><span class="line">selectOrSplit GR32:%15 [88r,208r:0)[208r,240r:1)[288B,320r:0) 0@88r 1@208r  weight:6.894198e-03</span><br><span class="line">hints: $edi</span><br><span class="line">assigning %15 to $edi: DIL [88r,208r:0)[208r,240r:1)[288B,320r:0) 0@88r 1@208r DIH [88r,208r:0)[208r,240r:1)[288B,320r:0) 0@88r 1@208r HDI [88r,208r:0)[208r,240r:1)[288B,320r:0) 0@88r 1@208r</span><br><span class="line"></span><br><span class="line">selectOrSplit GR32:%16 [32r,88r:0) 0@32r  weight:6.644737e-03</span><br><span class="line">hints: $edi</span><br><span class="line">missed hint $edi</span><br><span class="line">Analyze counted 3 instrs in 1 blocks, through 0 blocks.</span><br><span class="line">$edino positive bundles</span><br><span class="line">assigning %16 to $ebp: BPL [32r,88r:0) 0@32r BPH [32r,88r:0) 0@32r HBP [32r,88r:0) 0@32r</span><br><span class="line"></span><br><span class="line">selectOrSplit GR32:%2 [368r,400r:2)[400r,432r:0)[432r,448r:1) 0@400r 1@432r 2@368r  weight:INF</span><br><span class="line">hints: $eax $ebp</span><br><span class="line">assigning %2 to $eax: AH [368r,400r:2)[400r,432r:0)[432r,448r:1) 0@400r 1@432r 2@368r AL [368r,400r:2)[400r,432r:0)[432r,448r:1) 0@400r 1@432r 2@368r HAX [368r,400r:2)[400r,432r:0)[432r,448r:1) 0@400r 1@432r 2@368r</span><br><span class="line">Trying to reconcile hints for: %12($ebx)</span><br><span class="line">%12($ebx) is recolorable.</span><br><span class="line">Trying to reconcile hints for: %13($ebp)</span><br><span class="line">%13($ebp) is recolorable.</span><br><span class="line">Trying to reconcile hints for: %16($ebp)</span><br><span class="line">%16($ebp) is recolorable.</span><br><span class="line">********** REWRITE 
VIRTUAL REGISTERS **********</span><br><span class="line">********** Function: bar</span><br><span class="line">********** REGISTER MAP **********</span><br><span class="line">[%2 -&gt; $eax] GR32</span><br><span class="line">[%12 -&gt; $ebx] GR32</span><br><span class="line">[%13 -&gt; $ebp] GR32</span><br><span class="line">[%15 -&gt; $edi] GR32</span><br><span class="line">[%16 -&gt; $ebp] GR32</span><br><span class="line"></span><br><span class="line">0Bbb.0.entry:</span><br><span class="line">  successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)</span><br><span class="line">  liveins: $edi, $esi</span><br><span class="line">16B  %12:gr32 = COPY $esi</span><br><span class="line">32B  %16:gr32 = COPY $edi</span><br><span class="line">48B  ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !25; example.c:0</span><br><span class="line">64B  $edi = COPY %16:gr32, debug-location !25; example.c:0</span><br><span class="line">80B  CALL64pcrel32 target-flags(x86-plt) @foo, &lt;regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...&gt;, implicit $rsp, implicit $ssp, implicit $edi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax, debug-location !25; example.c:0</span><br><span class="line">88B  %15:gr32 = COPY killed %16:gr32</span><br><span class="line">96B  ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !25; example.c:0</span><br><span class="line">112B  %13:gr32 = COPY $eax, debug-location !25; example.c:0</span><br><span class="line">128B  CMP32rr %15:gr32, %12:gr32, implicit-def $eflags, debug-location !23; example.c:51:9</span><br><span class="line">144B  JCC_1 %bb.2, 14, implicit killed 
$eflags, debug-location !26; example.c:51:9</span><br><span class="line">160B  JMP_1 %bb.1, debug-location !26; example.c:51:9</span><br><span class="line">&gt; renamable $ebx = COPY $esi</span><br><span class="line">&gt; renamable $ebp = COPY $edi</span><br><span class="line">&gt; ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !25; example.c:0</span><br><span class="line">&gt; $edi = COPY renamable $ebp, debug-location !25; example.c:0</span><br><span class="line">&gt; CALL64pcrel32 target-flags(x86-plt) @foo, &lt;regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...&gt;, implicit $rsp, implicit $ssp, implicit $edi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax, debug-location !25; example.c:0</span><br><span class="line">&gt; renamable $edi = COPY killed renamable $ebp</span><br><span class="line">&gt; ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !25; example.c:0</span><br><span class="line">&gt; renamable $ebp = COPY $eax, debug-location !25; example.c:0</span><br><span class="line">&gt; CMP32rr renamable $edi, renamable $ebx, implicit-def $eflags, debug-location !23; example.c:51:9</span><br><span class="line">&gt; JCC_1 %bb.2, 14, implicit killed $eflags, debug-location !26; example.c:51:9</span><br><span class="line">&gt; JMP_1 %bb.1, debug-location !26; example.c:51:9</span><br><span class="line">176Bbb.1.if.then:</span><br><span class="line">; predecessors: %bb.0</span><br><span class="line">  successors: %bb.3(0x80000000); %bb.3(100.00%)</span><br><span class="line">  liveins: $ebp, $ebx, $edi</span><br><span class="line">208B  %15:gr32 = nsw IMUL32rr killed %15:gr32(tied-def 0), %12:gr32, implicit-def 
dead $eflags, debug-location !27; example.c:52:23</span><br><span class="line">240B  %13:gr32 = nsw ADD32rr killed %13:gr32(tied-def 0), killed %15:gr32, implicit-def dead $eflags, debug-instr-number 2, debug-location !29; example.c:52:19</span><br><span class="line">272B  JMP_1 %bb.3, debug-location !30; example.c:54:5</span><br><span class="line">&gt; renamable $edi = nsw IMUL32rr killed renamable $edi(tied-def 0), renamable $ebx, implicit-def dead $eflags, debug-location !27; example.c:52:23</span><br><span class="line">&gt; renamable $ebp = nsw ADD32rr killed renamable $ebp(tied-def 0), killed renamable $edi, implicit-def dead $eflags, debug-instr-number 2, debug-location !29; example.c:52:19</span><br><span class="line">&gt; JMP_1 %bb.3, debug-location !30; example.c:54:5</span><br><span class="line">288Bbb.2.if.else:</span><br><span class="line">; predecessors: %bb.0</span><br><span class="line">  successors: %bb.3(0x80000000); %bb.3(100.00%)</span><br><span class="line">  liveins: $ebp, $ebx, $edi</span><br><span class="line">304B  ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !31; example.c:55:20</span><br><span class="line">320B  $edi = COPY killed %15:gr32, debug-location !31; example.c:55:20</span><br><span class="line">336B  CALL64pcrel32 target-flags(x86-plt) @foo, &lt;regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...&gt;, implicit $rsp, implicit $ssp, implicit $edi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax, debug-location !31; example.c:55:20</span><br><span class="line">352B  ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !31; example.c:55:20</span><br><span class="line">368B  %2:gr32 = 
COPY $eax, debug-location !31; example.c:55:20</span><br><span class="line">400B  %2:gr32 = nsw IMUL32rr killed %2:gr32(tied-def 0), killed %13:gr32, implicit-def dead $eflags, debug-location !33; example.c:55:19</span><br><span class="line">432B  %2:gr32 = nsw ADD32rr killed %2:gr32(tied-def 0), %12:gr32, implicit-def dead $eflags, debug-instr-number 1, debug-location !34; example.c:55:27</span><br><span class="line">448B  %13:gr32 = COPY killed %2:gr32</span><br><span class="line">&gt; ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !31; example.c:55:20</span><br><span class="line">&gt; $edi = COPY killed renamable $edi, debug-location !31; example.c:55:20</span><br><span class="line">Identity copy: $edi = COPY killed renamable $edi, debug-location !31; example.c:55:20</span><br><span class="line">  deleted.</span><br><span class="line">&gt; CALL64pcrel32 target-flags(x86-plt) @foo, &lt;regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...&gt;, implicit $rsp, implicit $ssp, implicit $edi, implicit-def $rsp, implicit-def $ssp, implicit-def $eax, debug-location !31; example.c:55:20</span><br><span class="line">&gt; ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp, debug-location !31; example.c:55:20</span><br><span class="line">&gt; renamable $eax = COPY $eax, debug-location !31; example.c:55:20</span><br><span class="line">Identity copy: renamable $eax = COPY $eax, debug-location !31; example.c:55:20</span><br><span class="line">  deleted.</span><br><span class="line">&gt; renamable $eax = nsw IMUL32rr killed renamable $eax(tied-def 0), killed renamable $ebp, implicit-def dead $eflags, debug-location !33; example.c:55:19</span><br><span 
class="line">&gt; renamable $eax = nsw ADD32rr killed renamable $eax(tied-def 0), renamable $ebx, implicit-def dead $eflags, debug-instr-number 1, debug-location !34; example.c:55:27</span><br><span class="line">&gt; renamable $ebp = COPY killed renamable $eax</span><br><span class="line">464Bbb.3.if.end:</span><br><span class="line">; predecessors: %bb.2, %bb.1</span><br><span class="line">  liveins: $ebp, $ebx</span><br><span class="line">512B  %12:gr32 = nsw SUB32rr killed %12:gr32(tied-def 0), %13:gr32, implicit-def dead $eflags, debug-location !35; example.c:57:16</span><br><span class="line">544B  %12:gr32 = nsw IMUL32rr killed %12:gr32(tied-def 0), killed %13:gr32, implicit-def dead $eflags, debug-location !36; example.c:57:13</span><br><span class="line">560B  $eax = COPY killed %12:gr32, debug-location !37; example.c:57:5</span><br><span class="line">576B  RET 0, $eax, debug-location !37; example.c:57:5</span><br><span class="line">&gt; renamable $ebx = nsw SUB32rr killed renamable $ebx(tied-def 0), renamable $ebp, implicit-def dead $eflags, debug-location !35; example.c:57:16</span><br><span class="line">&gt; renamable $ebx = nsw IMUL32rr killed renamable $ebx(tied-def 0), killed renamable $ebp, implicit-def dead $eflags, debug-location !36; example.c:57:13</span><br><span class="line">&gt; $eax = COPY killed renamable $ebx, debug-location !37; example.c:57:5</span><br><span class="line">&gt; RET 0, $eax, debug-location !37; example.c:57:5</span><br><span class="line">Compiler returned: 0</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>As the dump shows, VNInfo values are numbered in SSA form.</p><h2 id="liveness"><a href="#liveness" class="headerlink" title="liveness"></a>liveness</h2><p>First, the data structures involved:</p><ol><li>VNInfo, SlotIndex</li><li>Segment, LiveRange, LiveInterval, LiveRangeUpdater</li><li>LiveRangeCalc, LiveIntervalCalc</li></ol><p>Each is explained in turn below.</p><p>VNInfo assigns a number to each definition. Numbering is independent for each virtual register. 
For example:</p><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">%1 [224r,240r:0)[240r,256r:1) 0@224r 1@240r  weight:0.000000e+00</span><br><span class="line">%2 [416r,432r:0)[432r,448r:1) 0@416r 1@432r  weight:0.000000e+00</span><br></pre></td></tr></table></figure><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span 
class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"><span class="comment">// llvm/include/llvm/CodeGen/LiveInterval.h</span></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">VNInfo</span> &#123;</span><br><span class="line">  <span class="keyword">public</span>:</span><br><span class="line">    <span class="comment">/// The ID number of this value.</span></span><br><span class="line">    <span class="type">unsigned</span> id;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/// The index of the defining instruction.</span></span><br><span class="line">    SlotIndex def;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/// Copy from the parameter into this VNInfo.</span></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">copyFrom</span><span class="params">(VNInfo &amp;src)</span> </span>&#123;</span><br><span class="line">      def = src.def;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/// Returns true if this value is defined by a PHI instruction (or was,</span></span><br><span class="line">    <span class="comment">/// PHI instructions may have been 
eliminated).</span></span><br><span class="line">    <span class="comment">/// PHI-defs begin at a block boundary, all other defs begin at register or</span></span><br><span class="line">    <span class="comment">/// EC slots.</span></span><br><span class="line">    <span class="function"><span class="type">bool</span> <span class="title">isPHIDef</span><span class="params">()</span> <span class="type">const</span> </span>&#123; <span class="keyword">return</span> def.<span class="built_in">isBlock</span>(); &#125;</span><br><span class="line"></span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span class="comment">// llvm/CodeGen/SlotIndexes.h</span></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">SlotIndex</span> &#123;</span><br><span class="line">    <span class="keyword">enum</span> <span class="title class_">Slot</span> &#123;</span><br><span class="line">    <span class="comment">/// Basic block boundary.  Used for live ranges entering and leaving a</span></span><br><span class="line">    <span class="comment">/// block without being live in the layout neighbor.  Also used as the</span></span><br><span class="line">    <span class="comment">/// def slot of PHI-defs.</span></span><br><span class="line">    Slot_Block,</span><br><span class="line"></span><br><span class="line">    <span class="comment">/// Early-clobber register use/def slot.  A live range defined at</span></span><br><span class="line">    <span class="comment">/// Slot_EarlyClobber interferes with normal live ranges killed at</span></span><br><span class="line">    <span class="comment">/// Slot_Register.  
Also used as the kill slot for live ranges tied to an</span></span><br><span class="line">    <span class="comment">/// early-clobber def.</span></span><br><span class="line">    Slot_EarlyClobber,</span><br><span class="line"></span><br><span class="line">    <span class="comment">/// Normal register use/def slot.  Normal instructions kill and define</span></span><br><span class="line">    <span class="comment">/// register live ranges at this slot.</span></span><br><span class="line">    Slot_Register,</span><br><span class="line"></span><br><span class="line">    <span class="comment">/// Dead def kill point.  Kill slot for a live range that is defined by</span></span><br><span class="line">    <span class="comment">/// the same instruction (Slot_Register or Slot_EarlyClobber), but isn&#x27;t</span></span><br><span class="line">    <span class="comment">/// used anywhere.</span></span><br><span class="line">    Slot_Dead,</span><br><span class="line"></span><br><span class="line">    Slot_Count</span><br><span class="line">    &#125;;</span><br><span class="line"></span><br><span class="line">    </span><br><span class="line"></span><br><span class="line">    PointerIntPair&lt;IndexListEntry*, <span class="number">2</span>, <span class="type">unsigned</span>&gt; lie;  <span class="comment">// the low 2 bits of the pointer are reused to store the Slot</span></span><br><span class="line">&#125;;</span><br><span class="line"><span class="keyword">class</span> <span class="title class_">IndexListEntry</span> : <span class="keyword">public</span> ilist_node&lt;IndexListEntry&gt; &#123;</span><br><span class="line">    MachineInstr *mi;</span><br><span class="line">    <span class="type">unsigned</span> index;</span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Manager class for slot indexes, one per function</span></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">SlotIndexes</span>&#123;</span><br><span class="line">  <span 
class="keyword">using</span> IndexList = simple_ilist&lt;IndexListEntry&gt;;</span><br><span class="line">  IndexList indexList; </span><br><span class="line">  DenseMap&lt;<span class="type">const</span> MachineInstr *, SlotIndex&gt; mi2iMap;</span><br><span class="line">  SmallVector&lt;std::pair&lt;SlotIndex, SlotIndex&gt;, <span class="number">8</span>&gt; MBBRanges;  <span class="comment">/// MBBRanges - Map MBB number to (start, stop) indexes.</span></span><br><span class="line">  SmallVector&lt;std::pair&lt;SlotIndex, MachineBasicBlock *&gt;, <span class="number">8</span>&gt; idx2MBBMap; <span class="comment">// sorted list, &lt;first-inst, mbb&gt;</span></span><br><span class="line"></span><br><span class="line">&#125;;</span><br><span class="line"></span><br></pre></td></tr></table></figure><hr><ul><li>A Segment is one contiguous half-open range [start, end) of slot indexes over which a single def (value number) is live. It often sits inside one block, but it can run into the next block in layout: in the dump above, the segment [32r,208r) of %10 starts in %bb.0 [0B;176B) and ends in %bb.1.</li><li>LiveRange organizes its segments in SSA form: each def gets its own value number, and a live-in&#x2F;PHI value also counts as a def.</li><li>LiveInterval is a LiveRange plus the register it belongs to.</li><li>LiveRangeUpdater: a buffer used while bulk-updating segments.</li></ul><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span 
class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span 
class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br><span class="line">122</span><br><span class="line">123</span><br><span class="line">124</span><br><span class="line">125</span><br><span class="line">126</span><br><span class="line">127</span><br><span class="line">128</span><br><span class="line">129</span><br><span class="line">130</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// [start, end)</span></span><br><span class="line"><span class="comment">// Segment representation</span></span><br><span class="line"><span class="comment">// llvm/include/llvm/CodeGen/LiveInterval.h</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">Segment</span> &#123;</span><br><span class="line">    SlotIndex start;  <span class="comment">// Start point of the interval (inclusive)</span></span><br><span class="line">    SlotIndex end;    <span class="comment">// End point of the interval (exclusive)</span></span><br><span class="line">    VNInfo *valno = <span 
class="literal">nullptr</span>; <span class="comment">// identifier for the value contained in this</span></span><br><span class="line">                            <span class="comment">// segment.</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">/// Return true if the index is covered by this segment.</span></span><br><span class="line">    <span class="function"><span class="type">bool</span> <span class="title">contains</span><span class="params">(SlotIndex I)</span> <span class="type">const</span> </span>&#123;</span><br><span class="line">        <span class="keyword">return</span> start &lt;= I &amp;&amp; I &lt; end;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/// Return true if the given interval, [S, E), is covered by this segment.</span></span><br><span class="line">    <span class="function"><span class="type">bool</span> <span class="title">containsInterval</span><span class="params">(SlotIndex S, SlotIndex E)</span> <span class="type">const</span> </span>&#123;</span><br><span class="line">        <span class="built_in">assert</span>((S &lt; E) &amp;&amp; <span class="string">&quot;Backwards interval?&quot;</span>);</span><br><span class="line">        <span class="keyword">return</span> (start &lt;= S &amp;&amp; S &lt; end) &amp;&amp; (start &lt; E &amp;&amp; E &lt;= end);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="type">bool</span> <span class="keyword">operator</span>&lt;(<span class="type">const</span> Segment &amp;Other) <span class="type">const</span> &#123;</span><br><span class="line">    <span class="keyword">return</span> std::<span class="built_in">tie</span>(start, end) &lt; std::<span class="built_in">tie</span>(Other.start, Other.end);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="type">bool</span> <span 
class="keyword">operator</span>==(<span class="type">const</span> Segment &amp;Other) <span class="type">const</span> &#123;</span><br><span class="line">    <span class="keyword">return</span> start == Other.start &amp;&amp; end == Other.end;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">dump</span><span class="params">()</span> <span class="type">const</span></span>;</span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span class="comment">/// This class represents the liveness of a register, stack slot, etc.</span></span><br><span class="line"><span class="comment">/// It manages an ordered list of Segment objects.</span></span><br><span class="line"><span class="comment">/// The Segments are organized in a static single assignment form: At places</span></span><br><span class="line"><span class="comment">/// where a new value is defined or different values reach a CFG join a new</span></span><br><span class="line"><span class="comment">/// segment with a new value number is used.</span></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">LiveRange</span> &#123;</span><br><span class="line">  <span class="keyword">public</span>:</span><br><span class="line"></span><br><span class="line">    <span class="keyword">using</span> Segments = SmallVector&lt;Segment, <span class="number">2</span>&gt;;</span><br><span class="line">    <span class="keyword">using</span> VNInfoList = SmallVector&lt;VNInfo *, <span class="number">2</span>&gt;;</span><br><span class="line"></span><br><span class="line">    Segments segments;   <span class="comment">// the liveness segments</span></span><br><span class="line">    VNInfoList valnos;   <span class="comment">// value#&#x27;s</span></span><br><span class="line">&#125;</span><br><span 
class="line"></span><br><span class="line"><span class="comment">/// Helper class for performant LiveRange bulk updates.</span></span><br><span class="line"><span class="comment">///</span></span><br><span class="line"><span class="comment">/// Calling LiveRange::addSegment() repeatedly can be expensive on large</span></span><br><span class="line"><span class="comment">/// live ranges because segments after the insertion point may need to be</span></span><br><span class="line"><span class="comment">/// shifted. The LiveRangeUpdater class can defer the shifting when adding</span></span><br><span class="line"><span class="comment">/// many segments in order.</span></span><br><span class="line"><span class="comment">///</span></span><br><span class="line"><span class="comment">/// The LiveRange will be in an invalid state until flush() is called.</span></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">LiveRangeUpdater</span> &#123;</span><br><span class="line">    LiveRange *LR;</span><br><span class="line">    SlotIndex LastStart;</span><br><span class="line">    LiveRange::iterator WriteI;</span><br><span class="line">    LiveRange::iterator ReadI;</span><br><span class="line">    SmallVector&lt;LiveRange::Segment, <span class="number">16</span>&gt; Spills;</span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">mergeSpills</span><span class="params">()</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">public</span>:</span><br><span class="line">    <span class="comment">/// Create a LiveRangeUpdater for adding segments to LR.</span></span><br><span class="line">    <span class="comment">/// LR will temporarily be in an invalid state until flush() is called.</span></span><br><span class="line">    <span class="built_in">LiveRangeUpdater</span>(LiveRange *lr = <span class="literal">nullptr</span>) : <span 
class="built_in">LR</span>(lr) &#123;&#125;</span><br><span class="line"></span><br><span class="line">    ~<span class="built_in">LiveRangeUpdater</span>() &#123; <span class="built_in">flush</span>(); &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/// Add a segment to LR and coalesce when possible, just like</span></span><br><span class="line">    <span class="comment">/// LR.addSegment(). Segments should be added in increasing start order for</span></span><br><span class="line">    <span class="comment">/// best performance.</span></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">add</span><span class="params">(LiveRange::Segment)</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">add</span><span class="params">(SlotIndex Start, SlotIndex End, VNInfo *VNI)</span> </span>&#123;</span><br><span class="line">      <span class="built_in">add</span>(LiveRange::<span class="built_in">Segment</span>(Start, End, VNI));</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/// Return true if the LR is currently in an invalid state, and flush()</span></span><br><span class="line">    <span class="comment">/// needs to be called.</span></span><br><span class="line">    <span class="function"><span class="type">bool</span> <span class="title">isDirty</span><span class="params">()</span> <span class="type">const</span> </span>&#123; <span class="keyword">return</span> LastStart.<span class="built_in">isValid</span>(); &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/// Flush the updater state to LR so it is valid and contains all added</span></span><br><span class="line">    <span class="comment">/// segments.</span></span><br><span class="line">    <span 
class="function"><span class="type">void</span> <span class="title">flush</span><span class="params">()</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/// Select a different destination live range.</span></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">setDest</span><span class="params">(LiveRange *lr)</span> </span>&#123;</span><br><span class="line">      <span class="keyword">if</span> (LR != lr &amp;&amp; <span class="built_in">isDirty</span>())</span><br><span class="line">        <span class="built_in">flush</span>();</span><br><span class="line">      LR = lr;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/// Get the current destination live range.</span></span><br><span class="line">    <span class="function">LiveRange *<span class="title">getDest</span><span class="params">()</span> <span class="type">const</span> </span>&#123; <span class="keyword">return</span> LR; &#125;</span><br><span class="line"></span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">LiveInterval</span> : <span class="keyword">public</span> LiveRange &#123;</span><br><span class="line">  <span class="keyword">public</span>:</span><br><span class="line">    <span class="keyword">using</span> super = LiveRange;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/// A live range for subregisters. 
The LaneMask specifies which parts of the</span></span><br><span class="line">    <span class="comment">/// super register are covered by the interval.</span></span><br><span class="line">    <span class="comment">/// (@sa TargetRegisterInfo::getSubRegIndexLaneMask()).</span></span><br><span class="line">    <span class="keyword">class</span> <span class="title class_">SubRange</span> : <span class="keyword">public</span> LiveRange &#123;</span><br><span class="line">    <span class="keyword">public</span>:</span><br><span class="line">      SubRange *Next = <span class="literal">nullptr</span>;</span><br><span class="line">      LaneBitmask LaneMask; <span class="comment">// subreg</span></span><br><span class="line"></span><br><span class="line">      <span class="comment">/// Constructs a new SubRange object.</span></span><br><span class="line">      <span class="built_in">SubRange</span>(LaneBitmask LaneMask) : <span class="built_in">LaneMask</span>(LaneMask) &#123;&#125;</span><br><span class="line"></span><br><span class="line">      <span class="comment">/// Constructs a new SubRange object by copying liveness from @p Other.</span></span><br><span class="line">      <span class="built_in">SubRange</span>(LaneBitmask LaneMask, <span class="type">const</span> LiveRange &amp;Other,</span><br><span class="line">               BumpPtrAllocator &amp;Allocator)</span><br><span class="line">        : <span class="built_in">LiveRange</span>(Other, Allocator), <span class="built_in">LaneMask</span>(LaneMask) &#123;&#125;</span><br><span class="line"></span><br><span class="line">      <span class="function"><span class="type">void</span> <span class="title">print</span><span class="params">(raw_ostream &amp;OS)</span> <span class="type">const</span></span>;</span><br><span class="line">      <span class="function"><span class="type">void</span> <span class="title">dump</span><span class="params">()</span> <span class="type">const</span></span>;</span><br><span 
class="line">    &#125;;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">private</span>:</span><br><span class="line">    SubRange *SubRanges = <span class="literal">nullptr</span>; <span class="comment">///&lt; Single linked list of subregister live</span></span><br><span class="line">                                   <span class="comment">/// ranges.</span></span><br><span class="line">    <span class="type">const</span> Register Reg; <span class="comment">// the register or stack slot of this interval.</span></span><br><span class="line">    <span class="type">float</span> Weight = <span class="number">0.0</span>; <span class="comment">// weight of this interval</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><hr><ul><li>LiveRangeCalc, LiveIntervalCalc: 计算和查询接口, 采用增量式更新</li></ul><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"><span class="comment">// llvm/CodeGen/LiveRangeCalc.cpp</span></span><br><span class="line">LiveRangeCalc::<span class="built_in">calculateValues</span>()&#123;</span><br><span class="line">    <span class="built_in">updateSSA</span>() <span class="comment">// 依据domtree来迭代至不动点, (为啥用domtree不太懂)</span></span><br><span class="line">    <span class="built_in">updateFromLiveIns</span>() <span class="comment">// livein --&gt; LiveRangeUpdater::add</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure>]]>
    </content>
    <id>http://example.com/2026/03/22/llvm/regalloc_llvm0/</id>
    <link href="http://example.com/2026/03/22/llvm/regalloc_llvm0/"/>
    <published>2026-03-22T00:00:00.000Z</published>
    <summary>
      <![CDATA[<h1 id="依赖的分析"><a href="#依赖的分析" class="headerlink" title="依赖的分析"></a>Analyses it depends on</h1><ul>
<li>value number</li>
<li>live range (liveness). This part is fairly complex.<]]>
    </summary>
    <title>LLVM register allocation 0</title>
    <updated>2026-04-28T13:16:41.648Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
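      <![CDATA[<p>Managing lots of image files is a hassle, so I want to keep the blog in plain-text formats as much as possible.</p><p>There are several ways to get graphics onto a web page:</p><ol><li>Draw with ASCII art, which is too tedious.</li><li>Mermaid and Graphviz, embedded in Markdown. Mermaid is simple but limited; Graphviz is more powerful.</li><li>TikZ, which is more professional: <a href="https://github.com/kisonecat/tikzjax">https://github.com/kisonecat/tikzjax</a> <a href="https://github.com/jhuix-js/tikzjax">https://github.com/jhuix-js/tikzjax</a></li><li>Base64 embedding, which makes the files noticeably larger.</li></ol><p>The key question is whether the framework turns the text representation into static&#x2F;*.svg or *.png files at build time, or whether it is rendered in the browser.</p>]]>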
    </content>
    <id>http://example.com/2026/03/18/allintext/</id>
    <link href="http://example.com/2026/03/18/allintext/"/>
    <published>2026-03-18T00:00:00.000Z</published>
    <summary>
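      <![CDATA[<p>Managing lots of image files is a hassle, so I want to keep the blog in plain-text formats as much as possible.</p>
<p>There are several ways to get graphics onto a web page:</p>
<ol>
<li>Draw with ASCII art, which is too tedious.</li>
<li>Mermaid and Graphviz, embedded in Markdown. Mermaid is simple but limited; graphv]]>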
    </summary>
    <title>Keeping the blog in text formats</title>
    <updated>2026-04-28T13:16:41.644Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
      <![CDATA[<p>Building LLVM</p><p>Using Windows 11 + Git Bash as the example.</p><ol><li>Configure</li></ol><figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">set</span> -x</span><br><span class="line"></span><br><span class="line"><span class="built_in">export</span> COMPILER_DIR=D:/LLVM/bin <span class="comment"># replace with the bin directory of an existing compiler</span></span><br><span class="line"><span class="built_in">export</span> CC=<span class="variable">$COMPILER_DIR</span>/clang.exe</span><br><span class="line"><span class="built_in">export</span> CXX=<span class="variable">$COMPILER_DIR</span>/clang++.exe</span><br><span class="line"><span class="built_in">export</span> RC_COMPILER=<span class="variable">$COMPILER_DIR</span>/llvm-rc.exe</span><br><span class="line"></span><br><span class="line"><span class="built_in">mkdir</span> -p build</span><br><span class="line"><span class="built_in">mkdir</span> -p install</span><br><span class="line"></span><br><span class="line">cmake -S ./llvm -B build -G Ninja -DCMAKE_BUILD_TYPE=DEBUG -DLLVM_ENABLE_PROJECTS=<span class="string">&quot;clang&quot;</span> -DLLVM_TARGETS_TO_BUILD=X86 -DCMAKE_RC_COMPILER=<span class="variable">$RC_COMPILER</span></span><br></pre></td></tr></table></figure><ol start="2"><li>Build with <code>ninja -C build</code>.<br>Budget at least 4 GB of memory per job; with 4 jobs, even 16 GB may not be enough. Linking is slow.</li></ol>]]>
    </content>
    <id>http://example.com/2026/03/15/llvm/llvm_build/</id>
    <link href="http://example.com/2026/03/15/llvm/llvm_build/"/>
    <published>2026-03-15T00:00:00.000Z</published>
    <summary>
      <![CDATA[<p>Building LLVM</p>
<p>Using Windows 11 + Git Bash as the example.</p>
<ol>
<li>Configure</li>
</ol>
<figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="l]]>
    </summary>
    <title>Building LLVM on Windows</title>
    <updated>2026-04-28T13:16:41.647Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
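      <![CDATA[<p>Notes from a year and a half of compiler development.</p><ol><li>The program compiles, but produces wrong results at run time.</li></ol><p>How to investigate?</p><ul><li>First shrink the problem to a minimal reproducible test case.</li><li>Walk the pipeline stage by stage: parser output, IR pass output, isel output, register allocation, stack frame layout, and so on.</li><li>Verification matters: every point where one intermediate representation is converted into another needs a verifier, so that what flows in and out is known to be well-formed.</li><li>Repeat the steps above until the problem is solved.</li></ul><p>Since the compiler is a pipeline, the hand-offs between stages deserve particular care.</p><ol start="2"><li>The compiler crashes with no other information. This is usually a null-pointer dereference of some kind, and it is easy to track down on Linux.</li><li>The ISA still needs instructions such as cmp a,b; cset; jmpcc. They keep the number of blocks down, at the cost of some implicit state.</li><li>For phi-node-elimination, the simple approach works well.</li><li>Keep legalization and optimization separate; written together they are hard to debug. For the phi elimination above, phi-node-elimination handles correctness and register-coalescer handles optimization.</li><li>Per-stage tests are indispensable. Time spent writing tests is well spent, and tests are a key way to keep project complexity in check. Think of how hard it is to finish a game on a single life.</li><li>Four or five IR levels is the practical limit; each level is effectively a small interpreter. LLVM has the AST, LLVM IR, and MIR + MC. Counting the preprocessor, that is already four levels, and MC comes more or less for free from TableGen. Every level has its own structures, needs its own tests, and needs conversions to its neighbors.</li><li>Less is more: make the initial goal as small as possible rather than sprawling. Each extra stage raises the complexity a notch.</li><li>On AI: for budget reasons I can only use the free tiers, and the results are actually decent. With a traditional flow of requirements analysis, design, coding, and testing, AI is quite useful. The code still needs a close eye on the concrete implementation; it is best to limit each task to a small scope and make a git commit at every step. Git is the real backbone of AI-assisted coding.</li></ol><hr><p>Some further notes on compiler back-end development:</p><ol><li><p>Handling phi nodes</p><ul><li>isel: choose the block traversal order carefully; RPO is a good choice.</li><li>legalize-type: incremental updates are preferable.</li><li>phi-elim: also incremental; leave the redundant copies to copy coalescing afterwards.</li></ul></li><li><p>Type legalization must run before opcode legalization, because the set of (opcode, type) pairs the machine supports is fixed.<br>For example, a 64-bit multiply on a 32-bit machine has to be split into mul-low and mul-high first, and only then is the legality of mul32 handled.</p></li><li><p>In stack frame layout, check the offsets carefully and test them thoroughly.</p></li><li><p>Register allocation can be viewed as a two-dimensional coloring problem: one axis is the variables, the other is the instruction index.</p></li></ol><p>reg0: [10, 14]<br>reg1: [2, 12]</p><p>Since different blocks execute with different frequencies, think of this as a frequency heat map; why the greedy algorithm evicts a particular piece then becomes easier to understand.</p><p>The downside of graph coloring is that it is (relatively) hard to debug.</p><hr>]]>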
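<![CDATA[<p>The 64-bit multiply example above can be worked through concretely. A minimal sketch of the arithmetic, assuming a 32-bit target that offers a low-half multiply and an unsigned high-half multiply; mul_lo32, mul_hi32, U64Parts, mul64, split and join are hypothetical helper names, not any compiler's API.</p>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers modelling the two operations the 32-bit target is
// assumed to provide. mul_hi32 is emulated with a 64-bit multiply here only
// so that the sketch can run on a host machine.
static uint32_t mul_lo32(uint32_t a, uint32_t b) {
  return a * b;  // low 32 bits of the product (unsigned wrap-around)
}
static uint32_t mul_hi32(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>((static_cast<uint64_t>(a) * b) >> 32);
}

// A 64-bit value as the legalizer sees it after type splitting.
struct U64Parts {
  uint32_t lo;
  uint32_t hi;
};

// 64-bit multiply, truncated to 64 bits, built from 32-bit pieces:
//   (a.hi * 2^32 + a.lo) * (b.hi * 2^32 + b.lo)
//     = a.lo * b.lo + 2^32 * (a.lo * b.hi + a.hi * b.lo)   (mod 2^64)
// The a.hi * b.hi term is shifted by 2^64 and drops out entirely.
U64Parts mul64(U64Parts a, U64Parts b) {
  U64Parts r;
  r.lo = mul_lo32(a.lo, b.lo);
  r.hi = mul_hi32(a.lo, b.lo) + mul_lo32(a.lo, b.hi) + mul_lo32(a.hi, b.lo);
  return r;
}

// Conversions used only to check the result against native arithmetic.
U64Parts split(uint64_t v) {
  return {static_cast<uint32_t>(v), static_cast<uint32_t>(v >> 32)};
}
uint64_t join(U64Parts p) {
  return (static_cast<uint64_t>(p.hi) << 32) | p.lo;
}
```

<p>Every operation left in mul64 is a 32-bit one, so a later opcode-legalization step only has to decide whether the target supports each of them directly.</p>]]>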
    </content>
    <id>http://example.com/2026/03/12/diary/2026-3-12/</id>
    <link href="http://example.com/2026/03/12/diary/2026-3-12/"/>
    <published>2026-03-12T00:00:00.000Z</published>
    <summary>
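      <![CDATA[<p>Notes from a year and a half of compiler development.</p>
<ol>
<li>The program compiles, but produces wrong results at run time.</li>
</ol>
<p>How to investigate?</p>
<ul>
<li>First shrink the problem to a minimal reproducible test case.</li>
<li>Walk the pipeline stage by stage: parser output, IR pass output,]]>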
    </summary>
    <title>Notes, 2026-3-12</title>
    <updated>2026-04-28T13:16:41.644Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
      <![CDATA[<h1 id="LLVM-指令调度"><a href="#LLVM-指令调度" class="headerlink" title="LLVM 指令调度"></a>LLVM instruction scheduling</h1><p>LLVM has instruction schedulers that work on SelectionDAG and on MachineInstr, and it separates the scheduling strategy from the scheduling framework.</p><h2 id="结构"><a href="#结构" class="headerlink" title="结构"></a>Structure</h2><ol><li>SDep represents one dependency edge.</li><li>SUnit represents one scheduling unit.</li><li>ScheduleDAG is the base class of the scheduling DAGs. <a href="https://llvm.org/doxygen/classllvm_1_1ScheduleDAG.html">https://llvm.org/doxygen/classllvm_1_1ScheduleDAG.html</a></li></ol><h2 id="ScheduleDAGSDNodes"><a href="#ScheduleDAGSDNodes" class="headerlink" title="ScheduleDAGSDNodes"></a>ScheduleDAGSDNodes</h2><p>Works on SelectionDAG.<br>It is used mainly in SelectionDAGISel::CodeGenAndEmitDAG().</p><h2 id="MachineScheduler"><a href="#MachineScheduler" class="headerlink" title="MachineScheduler"></a>MachineScheduler</h2><blockquote><p>Scope: MachineBasicBlock.</p></blockquote><p>Take the AArch64 pipeline at O2 as an example.</p><blockquote><p>Pass Arguments:  -tti -targetlibinfo -assumption-cache-tracker -targetpassconfig -machinemoduleinfo -profile-summary-info -tbaa -scoped-noalias-aa -collector-metadata -machine-branch-prob -regalloc-evict -regalloc-priority -domtree -basic-aa -aa -objc-arc-contract -pre-isel-intrinsic-lowering -expand-large-div-rem -expand-large-fp-convert -atomic-expand -simplifycfg -domtree -loops -loop-simplify -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -scalar-evolution -loop-data-prefetch -aarch64-falkor-hwpf-fix -basic-aa -loop-simplify -canon-freeze -iv-users -loop-reduce -basic-aa -aa -mergeicmps -loops -lazy-branch-prob -lazy-block-freq -expand-memcmp -gc-lowering -shadow-stack-gc-lowering -lower-constant-intrinsics -unreachableblockelim -loops -postdomtree -branch-prob -block-freq -consthoist -replace-with-veclib -partially-inline-libcalls -expandvp -post-inline-ee-instrument -scalarize-masked-mem-intrin -expand-reductions -loops -tlshoist -aarch64-globals-tagging -stack-safety -domtree -basic-aa -aa -aarch64-stack-tagging -complex-deinterleaving -aa -memoryssa -interleaved-load-combine -domtree 
-interleaved-access -aarch64-sme-abi -domtree -loops -type-promotion -codegenprepare -domtree -dwarf-eh-prepare -aarch64-promote-const -global-merge -callbrprepare -safe-stack -stack-protector -domtree -basic-aa -aa -loops -postdomtree -branch-prob -debug-ata -lazy-branch-prob -lazy-block-freq -aarch64-isel -machinedomtree -aarch64-local-dynamic-tls-cleanup -finalize-isel -lazy-machine-block-freq -early-tailduplication -opt-phis -slotindexes -stack-coloring -localstackalloc -dead-mi-elimination -machinedomtree -aarch64-condopt -machine-loops -machine-trace-metrics -aarch64-ccmp -lazy-machine-block-freq -machine-combiner -aarch64-cond-br-tuning -machine-trace-metrics -early-ifcvt -aarch64-stp-suppress -aarch64-simdinstr-opt -aarch64-stack-tagging-pre-ra -machinedomtree -machine-loops -machine-block-freq -early-machinelicm -machinedomtree -machine-block-freq -machine-cse -machinepostdomtree -machine-cycles -machine-sink -peephole-opt -dead-mi-elimination -aarch64-mi-peephole-opt -aarch64-dead-defs -detect-dead-lanes -init-undef -processimpdefs -unreachable-mbb-elimination -livevars -phi-node-elimination -twoaddressinstruction -machinedomtree -slotindexes -liveintervals -register-coalescer -rename-independent-subregs -machine-scheduler -aarch64-post-coalescer-pass -machine-block-freq -livedebugvars -livestacks -virtregmap -liveregmatrix -edge-bundles -spill-code-placement -lazy-machine-block-freq -machine-opt-remark-emitter -greedy -virtregrewriter -regallocscoringpass -stack-slot-coloring -machine-cp -machinelicm -aarch64-copyelim -aarch64-a57-fp-load-balancing -removeredundantdebugvalues -fixup-statepoint-caller-saved -postra-machine-sink -machinedomtree -machine-loops -machine-block-freq -machinepostdomtree -lazy-machine-block-freq -machine-opt-remark-emitter -shrink-wrap -prologepilog -machine-latecleanup -branch-folder -lazy-machine-block-freq -tailduplication -machine-cp -postrapseudos -aarch64-expand-pseudo -aarch64-ldst-opt -kcfi -aarch64-speculation-hardening 
-machinedomtree -machine-loops -aarch64-falkor-hwpf-fix-late -postmisched -gc-analysis -machine-block-freq -machinepostdomtree -block-placement -fentry-insert -xray-instrumentation -patchable-function -aarch64-fix-cortex-a53-835769-pass -funclet-layout -stackmap-liveness -livedebugvalues -machine-sanmd -machine-outliner -aarch64-sls-hardening -aarch64-ptrauth -aarch64-branch-targets -branch-relaxation -aarch64-jump-tables -cfi-fixup -lazy-machine-block-freq -machine-opt-remark-emitter -stack-frame-layout -unpack-mi-bundles -lazy-machine-block-freq -machine-opt-remark-emitter<br>The list contains a single machine-scheduler pass, which runs before register allocation, plus postmisched after it. The source is in llvm&#x2F;lib&#x2F;CodeGen&#x2F;MachineScheduler.cpp</p></blockquote><p>Each block is divided into SchedRegions, each covering N MachineInstrs, and a scheduler then schedules the instructions inside each region.</p><p>The AArch64 back end uses createPostMachineScheduler to create AArch64PostRASchedStrategy as its scheduler.</p><blockquote><p>Its class hierarchy is shown here: <a href="https://llvm.org/doxygen/classllvm_1_1MachineSchedStrategy.html">https://llvm.org/doxygen/classllvm_1_1MachineSchedStrategy.html</a></p></blockquote><blockquote><p>The algorithms and the implementation are quite complex. Oddly, in my earlier CPU2017 runs the choice of scheduling algorithm made little difference. Setting up sched.td is still necessary, of course.</p></blockquote>]]>
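<![CDATA[<p>The separation between scheduling framework and scheduling strategy mentioned in this post can be illustrated with a toy list scheduler. This is a minimal sketch under stated assumptions: ToyUnit, PickFn and scheduleRegion are hypothetical names, not LLVM's SUnit, SDep or MachineSchedStrategy, and latencies and register pressure are ignored.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical scheduling unit: one instruction plus its outgoing
// dependency edges (indices of the units that must come after it).
struct ToyUnit {
  std::vector<int> succs;
  int numPredsLeft = 0;  // filled in by scheduleRegion
};

// The "strategy": given the ready list, return the position to schedule next.
using PickFn = std::function<std::size_t(const std::vector<int> &ready)>;

// The "framework": generic top-down list scheduling over one region.
// It tracks readiness and delegates every choice to the strategy.
// Returns the unit indices in scheduled order.
std::vector<int> scheduleRegion(std::vector<ToyUnit> units, const PickFn &pick) {
  for (ToyUnit &u : units)
    u.numPredsLeft = 0;
  for (const ToyUnit &u : units)
    for (int s : u.succs)
      ++units[static_cast<std::size_t>(s)].numPredsLeft;

  std::vector<int> ready;
  std::vector<int> order;
  for (std::size_t i = 0; i < units.size(); ++i)
    if (units[i].numPredsLeft == 0)
      ready.push_back(static_cast<int>(i));

  while (!ready.empty()) {
    std::size_t k = pick(ready);
    assert(k < ready.size() && "strategy picked outside the ready list");
    int n = ready[k];
    ready.erase(ready.begin() + static_cast<std::ptrdiff_t>(k));
    order.push_back(n);
    // Releasing n may make its successors ready.
    for (int s : units[static_cast<std::size_t>(n)].succs)
      if (--units[static_cast<std::size_t>(s)].numPredsLeft == 0)
        ready.push_back(s);
  }
  return order;
}
```

<p>Swapping the PickFn changes the schedule without touching the dependency tracking, which is the point of keeping the two apart.</p>]]>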
    </content>
    <id>http://example.com/2026/03/10/llvm/llvm_schedule/</id>
    <link href="http://example.com/2026/03/10/llvm/llvm_schedule/"/>
    <published>2026-03-10T00:00:00.000Z</published>
    <summary>
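      <![CDATA[<h1 id="LLVM-指令调度"><a href="#LLVM-指令调度" class="headerlink" title="LLVM 指令调度"></a>LLVM instruction scheduling</h1><p>LLVM has instruction schedulers that work on SelectionDAG and on MachineInstr, and it separates the scheduling strategy from]]>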
    </summary>
    <title>LLVM Inst Schedule</title>
    <updated>2026-04-28T13:16:41.647Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
      <![CDATA[<blockquote><p>An introduction to register allocation</p></blockquote><h1 id="前置技术-Liveness-Analysis，-Live-Interval-reaching-define"><a href="#前置技术-Liveness-Analysis，-Live-Interval-reaching-define" class="headerlink" title="前置技术 Liveness Analysis， Live Interval, reaching define"></a>Prerequisites: Liveness Analysis, Live Interval, Reaching Definitions</h1><h2 id="Liveness-Analysis"><a href="#Liveness-Analysis" class="headerlink" title="Liveness Analysis"></a>Liveness Analysis</h2><p>A classic dataflow analysis, run backward (in reverse order).</p><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">live_out[b] = ⋃ (live_in[s] for s ∈ succ(b))</span><br><span class="line">live_in[b]  = use[b] ∪ (live_out[b] − def[b])</span><br><span class="line"></span><br></pre></td></tr></table></figure><h2 id="live-interval"><a href="#live-interval" class="headerlink" title="live interval"></a>live interval</h2><p>LLVM relies on this; a live interval is the set of def-to-use segments of a value.</p><h2 id="reaching-define-到达定值"><a href="#reaching-define-到达定值" class="headerlink" title="reaching define 到达定值"></a>Reaching definitions</h2><p>Computes which program points each definition can reach; a forward analysis.<br>It can be implemented with a bitset or a set.</p><p>Graph coloring uses it. Each definition gets a fresh ID.</p><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">b0:</span><br><span class="line">  v1 = ...</span><br><span class="line">  br b2</span><br><span class="line"></span><br><span class="line">b1:</span><br><span class="line">  v1 = ... </span><br><span class="line">  br b2</span><br><span class="line"></span><br><span class="line">b2:</span><br><span class="line">  print(v1)</span><br><span class="line"></span><br><span class="line"></span><br></pre></td></tr></table></figure><p>After numbering the definitions, this becomes:</p><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">b0:</span><br><span class="line">  v1 = ... ; &#123;d1:v1, &#125; bitset= &lt;10&gt;</span><br><span class="line">  br b2</span><br><span class="line"></span><br><span class="line">b1:</span><br><span class="line">  v1 = ... ; &#123;d2:v1, &#125; bitset= &lt;01&gt;</span><br><span class="line">  br b2</span><br><span class="line"></span><br><span class="line">b2:</span><br><span class="line">  print(v1)  ; &#123;d1:v1, d2:v1&#125; bitset= &lt;11&gt;</span><br><span class="line"></span><br></pre></td></tr></table></figure><hr><p>Note first that the register allocators described below all work on non-SSA IR.<br>I have not come across a register allocator implemented directly on SSA form.<br>Consider issues such as the ABI constraints at calls, two-address instructions, and implicit definitions of flags.</p><h1 id="全局的图着色"><a href="#全局的图着色" class="headerlink" title="全局的图着色"></a>Global graph coloring</h1><p>See chapter 16 of Advanced Compiler Design and Implementation.</p><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">fn alloc_reg()&#123;</span><br><span class="line">    bool success = false;</span><br><span class="line">    do&#123;</span><br><span class="line">        bool coalesce = false;</span><br><span class="line">        do&#123;</span><br><span class="line">            make_web()</span><br><span class="line">            build_adj_matrix()</span><br><span class="line">            coalesce = coalesce_regs()</span><br><span class="line">        &#125;while(coalesce); // repeat while a copy was coalesced</span><br><span class="line"></span><br><span class="line">        
build_adj_list()</span><br><span class="line">        compute_spill_costs()</span><br><span class="line">        prune_graph()</span><br><span class="line">        success = assign_regs()</span><br><span class="line">        if(success)&#123;</span><br><span class="line">            modify_code()</span><br><span class="line">        &#125;else&#123;</span><br><span class="line">            gen_spill_code()</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">    &#125;while(!success);</span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>Iterate to a fixed point, that is, until coloring succeeds with no further spills.</p><h2 id="1-make-web"><a href="#1-make-web" class="headerlink" title="1. make web"></a>1. make web</h2><p>There are two ways to build webs:</p><ol><li>From the reaching-definitions result.<ul><li>First build the du-chains.</li><li>Then build the webs from the du-chains.</li></ul></li><li>Directly: every virtual register is one web node.</li></ol><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">b0:</span><br><span class="line">2:  v1 = ...</span><br><span class="line">4:  br b2</span><br><span class="line"></span><br><span class="line">b1:</span><br><span class="line">6:  v1 = ... </span><br><span class="line">8:  br b2</span><br><span class="line"></span><br><span class="line">b2:</span><br><span class="line">10:  print(v1)</span><br><span class="line">12:  v1 = ...</span><br><span class="line">14:  print(v1)</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>First approach (which mixes in live-range information):<br>name, defs,  uses<br>web1, {2, 6}, {10}<br>web2, {12},   {14}</p><p>Second approach:<br>web1: {2,6,12}, {10, 14}</p><h2 id="2-build-adj-matrix"><a href="#2-build-adj-matrix" class="headerlink" title="2. build_adj_matrix"></a>2. build_adj_matrix</h2><p>The core of the algorithm; it uses a lower-triangular adjacency matrix.<br>The key question is how to decide whether two webs interfere.<br>Using reaching definitions alone produces many false positives, so liveness is needed for this decision.</p><h2 id="3-coalesce-regs"><a href="#3-coalesce-regs" class="headerlink" title="3. coalesce_regs"></a>3. coalesce_regs</h2><p>Tries to eliminate some <code>a = copy b</code> instructions.</p><h2 id="4-build-adj-list"><a href="#4-build-adj-list" class="headerlink" title="4. build_adj_list"></a>4. build_adj_list</h2><p>Straightforward: build the adjacency lists of the interference graph.</p><h2 id="5-compute-spill-costs"><a href="#5-compute-spill-costs" class="headerlink" title="5. compute_spill_costs"></a>5. compute_spill_costs</h2><p>A heuristic, based on:</p><ol><li>loop depth</li><li>block frequency</li><li>num of uses</li></ol><h2 id="6-prune-graph"><a href="#6-prune-graph" class="headerlink" title="6. prune_graph"></a>6. prune_graph</h2><p>The heart of the algorithm. This step pushes the webs of the interference graph onto a stack for <code>assign_regs</code> to consume.<br>One option is the <code>degree &lt; R</code> rule: repeatedly remove a node whose degree is <code>&lt; R</code>, since such a node can always be colored. The other is the optimistic heuristic: when only nodes with <code>degree &gt;= R</code> remain, remove one of them anyway and push it too, which generalizes the <code>&lt; R</code> rule.</p><h2 id="7-assign-regs"><a href="#7-assign-regs" class="headerlink" title="7. assign_regs"></a>7. assign_regs</h2><p>Assigns a color to each web. It does not actually rewrite the instructions yet.</p><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">success: bool = true</span><br><span class="line">while !stack.empty()</span><br><span class="line">    web = stack.pop()</span><br><span class="line">    c = min_color(web)</span><br><span class="line">    if c &gt; 0</span><br><span class="line">        adjList[web].color = c</span><br><span class="line">    else</span><br><span class="line">        adjList[web].spill = true</span><br><span class="line">        success = false</span><br><span class="line"></span><br><span class="line">return success</span><br></pre></td></tr></table></figure><h2 id="8-modify-code"><a href="#8-modify-code" class="headerlink" title="8. modify code"></a>8. modify code</h2><p>A straightforward rewrite of the instructions with the assigned registers.</p><h2 id="9-gen-spill-code"><a href="#9-gen-spill-code" class="headerlink" title="9. gen_spill_code"></a>9. gen_spill_code</h2><p>In short, emit a store after each def and a load before each use.<br>Note that graph coloring spills every web that has been marked as spill.</p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>Summary</h2><p>Graph coloring is clearly rather time-consuming: computing reaching definitions, building the adjacency matrix, and then iterating to a fixed point.<br>As for the quality of the generated code, its model of live ranges is fairly coarse and its spill decisions are fairly simple.</p><h1 id="linear-scan"><a href="#linear-scan" class="headerlink" title="linear scan"></a>linear scan</h1><p>Approaches the register allocation problem from the live-range point of view. See the paper Linear scan register allocation.</p><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">reg_alloc()&#123;</span><br><span class="line">  do_instruction_order()</span><br><span class="line">  l =  do_live_intervals_calc()</span><br><span class="line">  l.sort_by_start_point()</span><br><span class="line">  linear_scan(l)</span><br><span class="line"></span><br><span class="line">&#125;</span><br><span 
class="line"></span><br><span class="line">linear_scan(live_intervals l)&#123;</span><br><span class="line">  active = []</span><br><span class="line">  for live_interval i : l&#123;</span><br><span class="line">    expireOld(i)</span><br><span class="line">    if active.size() == R &#123;</span><br><span class="line">      spillAt(i)</span><br><span class="line">    &#125;else&#123;</span><br><span class="line">      register[i] = free_register.pop()</span><br><span class="line">      active.push(i)</span><br><span class="line">    &#125;</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">expireOld(i)&#123;</span><br><span class="line">  for live_interval j in ( active in order of increasing end point ) &#123;</span><br><span class="line">    if j.end &gt;= i.start &#123;</span><br><span class="line">      return </span><br><span class="line">    &#125;</span><br><span class="line">    remove j from active</span><br><span class="line">    free_register.push(add j.register)</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">spillAt(i)&#123; // 启发式算法</span><br><span class="line">  spill = active.last()</span><br><span class="line">  if spill.end &gt; i.end &#123;</span><br><span class="line">    i.register = spill.register</span><br><span class="line">    spill.location = new stack location</span><br><span class="line"></span><br><span class="line">    active.remove(spill)</span><br><span class="line"></span><br><span class="line">    active.push(i)</span><br><span class="line">    active.sort_by_end()</span><br><span class="line">  &#125;else&#123;</span><br><span class="line">    i.location = new stack location</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><ol><li>instruction order 
does not affect the correctness of the result, but it may affect code quality</li><li>active holds at most R intervals, where R is the number of available registers</li><li>The spill decision is a heuristic; the paper spills the interval whose live range ends last</li></ol><p>Possible improvements:</p><ol><li>Live intervals are too coarse; the holes inside an interval could be exploited</li><li>The spill heuristic is too crude and could take more information into account</li><li>Copy coalescing: try to eliminate copy instructions such as <code>a = copy b</code></li></ol><h1 id="参考"><a href="#参考" class="headerlink" title="参考"></a>References</h1><ol><li>Advanced Compiler Design and Implementation (高级编译器设计与实现)</li><li><a href="https://www.cnblogs.com/AANA/p/16315859.html">https://www.cnblogs.com/AANA/p/16315859.html</a></li><li><a href="https://www.cnblogs.com/hsyluxiaoguo/p/18902335">https://www.cnblogs.com/hsyluxiaoguo/p/18902335</a></li><li>Massimiliano Poletto and Vivek Sarkar. 1999. Linear scan register allocation. ACM Trans. Program. Lang. Syst. 21, 5 (Sept. 1999), 895–913. <a href="https://doi.org/10.1145/330249.330250">https://doi.org/10.1145/330249.330250</a>, <a href="https://web.cs.ucla.edu/~palsberg/course/cs132/linearscan.pdf">https://web.cs.ucla.edu/~palsberg/course/cs132/linearscan.pdf</a></li></ol>]]>
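<![CDATA[<p>The linear scan pseudocode above can be turned into a small runnable sketch. The Python below is only an illustration under stated assumptions: the <code>Interval</code> class, the register names and the integer stack-slot numbering are hypothetical and come neither from the paper nor from any real compiler.</p>

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Interval:
    """Hypothetical live interval: [start, end] in instruction numbers."""
    name: str
    start: int
    end: int
    register: Optional[str] = None   # physical register, if one was assigned
    location: Optional[int] = None   # stack slot index, if the interval was spilled


def linear_scan(intervals: List[Interval], registers: List[str]) -> None:
    """Assign a register or a stack slot to every interval, in place."""
    free = list(registers)           # pool of currently unused registers
    active: List[Interval] = []      # intervals holding a register, sorted by end
    next_slot = 0                    # next unused stack slot (assumed numbering)

    for cur in sorted(intervals, key=lambda iv: iv.start):
        # expireOld: release registers of intervals that ended before cur starts
        while active and cur.start > active[0].end:
            free.append(active.pop(0).register)

        if free:
            cur.register = free.pop()
            active.append(cur)
        else:
            # spillAt: the heuristic spills whichever interval ends last
            spill = active[-1] if active else None
            if spill is not None and spill.end > cur.end:
                cur.register = spill.register
                spill.register = None
                spill.location = next_slot
                active[-1] = cur
            else:
                cur.location = next_slot
            next_slot += 1

        active.sort(key=lambda iv: iv.end)
```

<p>With two registers and the intervals a[0,10], b[1,3], c[2,8], d[4,6], this sketch spills <code>a</code> (it ends last) and lets <code>d</code> reuse the register freed when <code>b</code> expires.</p>]]>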
    </content>
    <id>http://example.com/2026/02/20/regalloc/</id>
    <link href="http://example.com/2026/02/20/regalloc/"/>
    <published>2026-02-20T00:00:00.000Z</published>
    <summary>
      <![CDATA[<blockquote>
<p>Introduction to register allocation</p>
</blockquote>
<h1 id="前置技术-Liveness-Analysis，-Live-Interval-reaching-define"><a href="#前置技术-Liveness-Analysis，-L]]>
    </summary>
    <title>Introduction to Register Allocation</title>
    <updated>2026-04-28T13:16:41.648Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
      <![CDATA[<p>The importance of loops needs no elaboration.</p><p>LLVM has many loop-related optimizations:</p><blockquote><ul><li>loop unroll and jam: loop unrolling plus fusion</li><li>loop unroll</li><li>SCEV</li><li>loop invariant code motion</li><li>loop interchange</li><li>loop rotation</li><li>loop splitting</li><li>loop fusion</li><li>loop unswitching</li><li>loop vectorization<br>…</li></ul></blockquote><ul><li>Identify loops in an unstructured CFG</li><li>Identify loop induction variables</li></ul><ol><li>Loop definition</li></ol><p>A loop in a control flow graph is a set of nodes S including a header node h, with the following properties:</p><ul><li>From any node in S there is a path leading to h </li><li>There is a path from h to any node in S</li><li>There is no edge from any node outside S to any node in S other than h</li></ul><ol><li><p>back edge<br>A control flow graph edge from a node n to a node h that dominates n is called a back edge.</p></li><li><p>natural loop<br>The natural loop of a backedge (n,h), where h dominates n, is<br> • the set of nodes x such that h dominates x and<br> • there is a path from x to n not containing h.<br> The header of this loop will be h<br> Each back-edge has a corresponding natural loop</p></li><li><p>nested loop<br>Suppose:<br> – A and B are loops with headers a and b, such that a !&#x3D; b, and b is in A<br>Then<br> – The nodes of B must be a proper subset of the nodes of A<br> – We say that loop B is nested within A<br> – B is the inner loop</p></li></ol><hr><h1 id="loop识别"><a href="#loop识别" class="headerlink" title="loop识别"></a>Loop identification</h1><p>Assume the dominator tree (domtree) is already available.<br>Following the loop definition above, loops are built by traversing the domtree.</p><p>The domtree is traversed in post-order, so inner loops are identified before outer loops.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span 
class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">initLoopInfo</span><span class="params">()</span></span>&#123;</span><br><span class="line">stack = [root];</span><br><span class="line"><span 
class="keyword">while</span>(stack.<span class="built_in">size</span>())&#123;</span><br><span class="line"></span><br><span class="line"><span class="keyword">auto</span>*top = stack.<span class="built_in">back</span>();</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span>( !<span class="built_in">all_children_visited</span>(top) )&#123;</span><br><span class="line"><span class="keyword">for</span> c in top.<span class="built_in">children</span>()&#123;</span><br><span class="line">stack.<span class="built_in">push</span>(c)</span><br><span class="line">&#125;</span><br><span class="line">&#125;<span class="keyword">else</span>&#123;</span><br><span class="line">stack.<span class="built_in">pop</span>();</span><br><span class="line"></span><br><span class="line"><span class="comment">// 1. find back edge</span></span><br><span class="line">backedgs = []</span><br><span class="line"><span class="keyword">for</span> pred in top.<span class="built_in">getBlock</span>().<span class="built_in">pred</span>()&#123;</span><br><span class="line"><span class="keyword">if</span>( top.<span class="built_in">doms</span>( pred ) )&#123;</span><br><span class="line">backedgs.<span class="built_in">push</span>(pred)</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> (backedgs.<span class="built_in">size</span>())&#123;</span><br><span class="line">loop = <span class="built_in">createLoop</span>()</span><br><span class="line"><span class="built_in">discoverLoop</span>(loop, backedgs)</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// With the back edges known, the loop body can be collected from the loop header.</span></span><br><span class="line"><span class="comment">// It suffices to walk the predecessors of the back edges.</span></span><br><span 
class="type">void</span> <span class="built_in">discoverLoop</span>(loop, backedgs)&#123;</span><br><span class="line"></span><br><span class="line"><span class="keyword">while</span>(backedgs.<span class="built_in">size</span>())&#123;</span><br><span class="line"><span class="keyword">auto</span>* n = backedgs.<span class="built_in">back</span>();</span><br><span class="line">backedgs.<span class="built_in">pop_back</span>();</span><br><span class="line"></span><br><span class="line">sub_loop = info.<span class="built_in">get</span>(n)</span><br><span class="line"><span class="keyword">if</span>(sub_loop == <span class="literal">nullptr</span>)&#123;</span><br><span class="line"><span class="keyword">if</span>(!loop.<span class="built_in">getHeader</span>().<span class="built_in">dom</span>(n))&#123;</span><br><span class="line"><span class="keyword">continue</span></span><br><span class="line">&#125;</span><br><span class="line">loop.<span class="built_in">add</span>(n)</span><br><span class="line"><span class="keyword">for</span> p in n.<span class="built_in">pred</span>()&#123;</span><br><span class="line">backedgs.<span class="built_in">push</span>(p)</span><br><span class="line">&#125;</span><br><span class="line">&#125;<span class="keyword">else</span>&#123;</span><br><span class="line"><span class="comment">// n already belongs to an inner loop</span></span><br><span class="line">sub = sub_loop-&gt;<span class="built_in">getOuterMostLoop</span>()</span><br><span class="line">loop.<span class="built_in">add_sub</span>(sub)</span><br><span class="line"></span><br><span class="line">n = sub.<span class="built_in">getHeader</span>()</span><br><span class="line"><span class="comment">// skip the blocks that already belong to sub</span></span><br><span class="line"><span class="keyword">for</span> p in n.<span class="built_in">preds</span>()&#123;</span><br><span class="line"><span class="keyword">if</span>( <span class="built_in">getLoop</span>(p) != sub )&#123;</span><br><span class="line">backedgs.<span 
class="built_in">push</span>(p)</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><hr><ul><li><a href="https://www.doc.ic.ac.uk/~phjk/Compilers/Lectures/pdfs/Ch7-part2-DominatorsAndNaturalLoops.pdf">Compilers I Chapter 1: Introduction</a></li><li><a href="https://dl.acm.org/doi/pdf/10.1145/570886.570887">On loops, dominators, and dominance frontiers (acm.org)</a></li><li><a href="https://www.cs.utexas.edu/~pingali/CS375/2010Sp/lectures/LoopOptimizations.pdf">Microsoft PowerPoint - loopOptimization [Compatibility Mode] (utexas.edu)</a></li></ul>]]>
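<![CDATA[<p>The back-edge and natural-loop definitions above can be checked on a toy CFG. The Python below is a minimal sketch, not LLVM's LoopInfo: the helper names <code>predecessors</code>, <code>dominators</code> and <code>natural_loops</code> are hypothetical, the CFG is a plain successor map, dominators are computed with the simple iterative set-based method instead of a domtree, and every node is assumed to be reachable from the entry.</p>

```python
from typing import Dict, List, Set


def predecessors(succ: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert the successor map of a CFG."""
    preds: Dict[str, List[str]] = {n: [] for n in succ}
    for n, outs in succ.items():
        for s in outs:
            preds[s].append(n)
    return preds


def dominators(succ: Dict[str, List[str]], entry: str) -> Dict[str, Set[str]]:
    """Iterative data-flow solution: dom[n] is the set of nodes dominating n.

    Assumes every node is a key of succ and is reachable from entry.
    """
    preds = predecessors(succ)
    nodes = set(succ)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | set.intersection(*incoming)
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def natural_loops(succ: Dict[str, List[str]], entry: str) -> Dict[str, Set[str]]:
    """Map each header to the union of the natural loops of its back edges."""
    dom = dominators(succ, entry)
    preds = predecessors(succ)
    loops: Dict[str, Set[str]] = {}
    for n, outs in succ.items():
        for h in outs:
            if h not in dom[n]:
                continue                 # n -> h is not a back edge
            body = loops.setdefault(h, {h})
            work = [n]                   # walk predecessors backwards from n
            while work:
                x = work.pop()
                if x not in body:        # the header is already in body, so the walk stops there
                    body.add(x)
                    work.extend(preds[x])
    return loops
```

<p>On a CFG with an outer header <code>h1</code> and an inner header <code>h2</code>, the inner loop's blocks come out as a proper subset of the outer loop's blocks, which matches the nested-loop property quoted above.</p>]]>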
    </content>
    <id>http://example.com/2025/10/10/loop%E4%BB%8B%E7%BB%8D/</id>
    <link href="http://example.com/2025/10/10/loop%E4%BB%8B%E7%BB%8D/"/>
    <published>2025-10-10T00:00:00.000Z</published>
    <summary>
      <![CDATA[<p>The importance of loops needs no elaboration.</p>
<p>LLVM has many loop-related optimizations:</p>
<blockquote>
<ul>
<li>loop unroll and jam: loop unrolling plus fusion</li>
<li>loop unroll</li>
<li>SCEV</li>
<li>]]>
    </summary>
    <title>Introduction to Loops</title>
    <updated>2026-04-28T13:16:41.648Z</updated>
  </entry>
  <entry>
    <author>
      <name>John Doe</name>
    </author>
    <content>
      <![CDATA[<blockquote><p>Creating and eliminating phi nodes</p></blockquote><h1 id="1-SSA构建算法"><a href="#1-SSA构建算法" class="headerlink" title="1. SSA构建算法"></a>1. SSA construction algorithm</h1><blockquote><p>Prerequisites:</p><ol><li>dom tree</li><li>DF (dominance frontier)</li><li>Basic knowledge of LLVM IR</li></ol></blockquote><p>For the following C code</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">int</span> <span class="title function_">foo</span><span class="params">(<span class="type">int</span> a)</span>&#123;</span><br><span class="line"><span class="type">int</span> b = a;</span><br><span class="line"><span class="keyword">return</span> b;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>clang generates IR similar to this</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">define i32 @foo(i32 %a)&#123;</span><br><span class="line">%pa = alloc i32</span><br><span class="line">%pb = alloc i32</span><br><span class="line">store %a, %pa</span><br><span class="line">store %a, %pb</span><br><span class="line">%b1 = load %pb</span><br><span class="line">ret i32 %b1</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>We can see that LLVM uses alloc + load + store instructions (<code>alloca</code> in real LLVM IR) to hold C local variables.<br>How the parameter is handled here (<code>i32 %a; store %a, %pa</code>) is actually a fairly involved topic, so it is set aside.<br>This IR contains many redundant load&#x2F;store instructions. Is there a way to eliminate them?<br>SROA (mem2reg)!<br>The expected IR:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span 
class="line">define i32 @foo(i32 %a)&#123;</span><br><span class="line">ret i32 %a</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>如果控制流比较复杂呢？</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">int</span> b = a;</span><br><span class="line"><span class="keyword">if</span>(a &gt; <span class="number">10</span>)&#123;</span><br><span class="line">b = b + a;</span><br><span class="line">&#125;<span class="keyword">else</span>&#123;</span><br><span class="line">b = b * a;</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> b;</span><br><span class="line">=&gt; </span><br><span class="line">ir</span><br><span class="line">bb0:</span><br><span class="line">  %pb = alloc i32</span><br><span class="line">  store %a, %pb</span><br><span class="line">  %c = CMP_GT %a, <span class="number">10</span></span><br><span class="line">  br %c, %bb1, %bb2</span><br><span class="line"></span><br><span 
class="line">bb1:</span><br><span class="line">  %b0 = load %pb</span><br><span class="line">  %b1 = add %b0, %a</span><br><span class="line">  store %b1, %pb</span><br><span class="line">  br %bb4</span><br><span class="line"></span><br><span class="line">bb3:</span><br><span class="line">  %b2 = load %pb</span><br><span class="line">  %b3 = mul %b2, %a</span><br><span class="line">  store %b3, %pb</span><br><span class="line">  br %bb4</span><br><span class="line">bb4:</span><br><span class="line">  %b4 = load %pb</span><br><span class="line">  ret %b4</span><br></pre></td></tr></table></figure><p>Here we find that correctly removing these load&#x2F;store instructions requires introducing <code>phinode</code>, which in turn requires control-flow information. A phinode is a purely logical node with no corresponding instruction&#x2F;register on a real CPU, so phi nodes must be eliminated again before actual assembly is generated.</p><p>Next, let us see how SSA construction is implemented:</p><p>It works in two steps, InsertPHI and Rename; the pseudocode is as follows:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span 
class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">procedure <span class="title function_">InsertPHI</span><span class="params">(variable_list)</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> each variable v in variable_list</span><br><span class="line">        WorkList ← ∅</span><br><span class="line">        EverOnWorkList ← ∅</span><br><span class="line">        AlreadyHasPhiFunc ← ∅</span><br><span class="line">        <span class="keyword">for</span> each node n containing an assignment to v</span><br><span class="line">            
WorkList ← WorkList ∪ &#123;n&#125;</span><br><span class="line">        end</span><br><span class="line"></span><br><span class="line">        EverOnWorkList ← WorkList</span><br><span class="line">        <span class="keyword">while</span> WorkList ≠ ∅</span><br><span class="line">            Remove some node n from WorkList</span><br><span class="line">            <span class="keyword">for</span> each d ∈ DF(n)</span><br><span class="line">                <span class="keyword">if</span> d ∉ AlreadyHasPhiFunc</span><br><span class="line">                    Insert a φ-function <span class="keyword">for</span> v at d</span><br><span class="line">                    AlreadyHasPhiFunc ← AlreadyHasPhiFunc ∪ &#123;d&#125;</span><br><span class="line">                    <span class="keyword">if</span> d ∉ EverOnWorkList</span><br><span class="line">                        WorkList ← WorkList ∪ &#123;d&#125;</span><br><span class="line">                        EverOnWorkList ← EverOnWorkList ∪ &#123;d&#125;</span><br><span class="line">                    end</span><br><span class="line">                end</span><br><span class="line">            end</span><br><span class="line">        end</span><br><span class="line">    end</span><br><span class="line">end procedure</span><br><span class="line"></span><br><span class="line">  </span><br><span class="line"></span><br><span class="line">procedure <span class="title function_">GenName</span><span class="params">(variable v)</span></span><br><span class="line">    vn = new Value(v)</span><br><span class="line">    Push vn onto Stacks[v]</span><br><span class="line">    <span class="keyword">return</span> vn</span><br><span class="line">end procedure</span><br><span class="line"></span><br><span class="line">  </span><br><span class="line"></span><br><span class="line">procedure Rename(block b)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> b previously visited <span 
class="keyword">return</span></span><br><span class="line">    <span class="keyword">for</span> each φ-function p in b</span><br><span class="line">        v = LHS(p)</span><br><span class="line">        vn = GenName(v) and replace v with vn</span><br><span class="line">    end</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> each statement s in b (in order)</span><br><span class="line">        <span class="keyword">for</span> each variable v ∈ RHS(s)</span><br><span class="line">            replace v by Top(Stacks[v])</span><br><span class="line">        end</span><br><span class="line"></span><br><span class="line">        <span class="keyword">for</span> each variable v ∈ LHS(s)</span><br><span class="line">            vn = GenName(v) and replace v with vn</span><br><span class="line">        end</span><br><span class="line">    end</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> each s ∈ succ(b) (in CFG)</span><br><span class="line">        j ← position in s’s φ-function corresponding to block b</span><br><span class="line">        <span class="keyword">for</span> each φ-function p in s</span><br><span class="line">            replace the jth operand of RHS(p) by Top(Stacks[v]), where v is the variable of p</span><br><span class="line">        end</span><br><span class="line">    end</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> each s ∈ child(b) (in DT)</span><br><span class="line">        Rename(s)</span><br><span class="line">    end</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> each φ-function or statement t in b</span><br><span class="line">        <span class="keyword">for</span> each v ∈ LHS(t)</span><br><span class="line">            Pop(Stacks[v])</span><br><span class="line">        end</span><br><span class="line">    end</span><br><span class="line">end 
procedure</span><br><span class="line"></span><br><span class="line">SSAConstruction:</span><br><span class="line">    InsertPHI(<span class="built_in">list</span>)</span><br><span class="line">    Rename(entry)</span><br><span class="line">end</span><br></pre></td></tr></table></figure><p>This pseudocode looks quite different from LLVM IR because the IR formats differ. Next, let us see how mem2reg can be implemented on LLVM IR.</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span 
class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span 
class="line">112</span><br><span class="line">113</span><br></pre></td><td class="code"><pre><span class="line">mem2reg(Function f)&#123;</span><br><span class="line"><span class="built_in">list</span> = []</span><br><span class="line"><span class="keyword">for</span> inst in f.entry()&#123;</span><br><span class="line"><span class="keyword">if</span> inst is alloc and isPromotable(inst)&#123;</span><br><span class="line"><span class="comment">// some allocs cannot be promoted, e.g. volatile ones</span></span><br><span class="line"><span class="built_in">list</span>.push(inst)</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> <span class="built_in">list</span>.empty() <span class="keyword">return</span></span><br><span class="line"></span><br><span class="line">Phi2Alloc = InsertPHI(<span class="built_in">list</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment">// init map: alloc -&gt; stack&lt;value&gt;</span></span><br><span class="line"><span class="comment">// with first value undef</span></span><br><span class="line">incomings_map = &#123;&#125; </span><br><span class="line"><span class="keyword">for</span> alloc in <span class="built_in">list</span> &#123;</span><br><span class="line">incomings_map[alloc] = [undef]</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">Rename(f.entry(), incomings_map, Phi2Alloc)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">InsertPHI(<span class="built_in">list</span>)&#123;</span><br><span class="line">Phi2Alloc = &#123;&#125;</span><br><span class="line"><span class="keyword">for</span> alloc in <span class="built_in">list</span>&#123;</span><br><span class="line">WorkList = []</span><br><span class="line"><span class="keyword">for</span> user in alloc.use_list()&#123;</span><br><span class="line"><span class="keyword">if</span> user is store 
&#123;</span><br><span class="line">WorkList.push(user.parent())</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">EverOnWorkList = WorkList.clone()</span><br><span class="line">AlreadyHasPhi = []</span><br><span class="line"></span><br><span class="line"><span class="keyword">while</span> !WorkList.empty() &#123;</span><br><span class="line">n = WorkList.pop()</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> d in DF(n) &#123;</span><br><span class="line"><span class="keyword">if</span> d not in AlreadyHasPhi &#123;</span><br><span class="line">phi = create_phi()</span><br><span class="line">phi.set_all_incomings(undef)</span><br><span class="line">d.insert_front(phi)</span><br><span class="line">Phi2Alloc[phi] = alloc</span><br><span class="line"></span><br><span class="line">AlreadyHasPhi.add(d)</span><br><span class="line"><span class="keyword">if</span> d not in EverOnWorkList &#123;</span><br><span class="line">WorkList.push(d)</span><br><span class="line">EverOnWorkList.push(d)</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> Phi2Alloc</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">Rename(b, incomings_map, Phi2Alloc)&#123;</span><br><span class="line"><span class="keyword">if</span> b visited, <span class="keyword">return</span></span><br><span class="line"><span class="keyword">for</span> inst in b &#123;</span><br><span class="line"><span class="keyword">if</span> inst is phi &#123;</span><br><span class="line"><span class="keyword">if</span> inst not in Phi2Alloc</span><br><span class="line"><span class="keyword">continue</span></span><br><span class="line">alloc = Phi2Alloc[inst]</span><br><span 
class="line">incomings_map[alloc].push(inst) <span class="comment">// update top</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">continue</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> inst is load &#123;</span><br><span class="line"><span class="comment">// %inst = load %address</span></span><br><span class="line">ad = inst.get_address();</span><br><span class="line"><span class="keyword">if</span>(ad is alloc) &#123;</span><br><span class="line">top = incomings_map[ad].top();</span><br><span class="line">inst.replaceAllUsesWith(top)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">continue</span>;</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> inst is store &#123;</span><br><span class="line">ad = inst.get_address()</span><br><span class="line"><span class="comment">// store %x, %address</span></span><br><span class="line"><span class="keyword">if</span> ad is alloc &#123;</span><br><span class="line"><span class="comment">// update top</span></span><br><span class="line">incomings_map[ad].push(inst.getValue());</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">for</span> succ in b.getSuccessors() &#123; <span class="comment">// cfg</span></span><br><span class="line"><span class="keyword">for</span> phi in succ.getPhiNodes() &#123;</span><br><span class="line">alloc = Phi2Alloc[phi]</span><br><span class="line">top = incomings_map[alloc].top()</span><br><span class="line">phi.update_by_block(b, top) <span class="comment">// update for incoming</span></span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> child in DomTree(b).children() &#123;</span><br><span 
class="line"><span class="comment">// deep dominator trees may overflow the stack...</span></span><br><span class="line">PhiRename(child, incomings_map, Phi2Alloc)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// pop what this block pushed</span></span><br><span class="line"><span class="keyword">for</span> inst in bb &#123;</span><br><span class="line"><span class="keyword">if</span> inst is phi and inst in Phi2Alloc &#123;</span><br><span class="line">alloc = Phi2Alloc[inst]</span><br><span class="line">incomings_map[alloc].pop()</span><br><span class="line"><span class="keyword">continue</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> inst is store &#123;</span><br><span class="line">ad = inst.getAddress()</span><br><span class="line"><span class="keyword">if</span> ad is alloc &#123; incomings_map[ad].pop() &#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>Of course, the code above is not the actual implementation of LLVM's mem2reg.</p><p>LLVM's implementation adds several optimizations, such as an optimized dominator-tree traversal and special-case handling.</p><hr><h1 id="2-SSA-destruction"><a href="#2-SSA-destruction" class="headerlink" title="2. SSA destruction"></a>2. 
SSA destruction</h1><p>How do we destroy SSA form, i.e. remove the phi nodes?</p><blockquote><p>In LLVM, phi elimination happens at the MIR stage.</p></blockquote><p>In general, SSA destruction replaces phi nodes with copy instructions.</p><ol><li>The brute-force approach: for <code>p = phi (%a1, %bb1), (%a2, %bb2) ....</code>, insert <code>p = copy %a...</code> at the end of bb1, bb2, … and then delete the phi node. This approach, however, can produce incorrect code.</li></ol><p>Consider the lost copy problem and the swap problem:</p><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">// Revisiting Out-of-SSA Translation for Correctness, Code Quality, and Efficiency</span><br><span class="line">// https://inria.hal.science/inria-00349925v1/document</span><br><span class="line"></span><br><span class="line">lost copy problem:</span><br><span class="line"></span><br><span class="line">b0:</span><br><span class="line">x1 =...</span><br><span class="line">jmp b1</span><br><span class="line"></span><br><span class="line">b1:  </span><br><span class="line">x2 = phi(x1, x3)      
</span><br><span class="line">x3 = x2 + 1  </span><br><span class="line">...</span><br><span class="line">jmp p, b1, b2</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">-----------------------</span><br><span class="line"></span><br><span class="line">swap problem:</span><br><span class="line"></span><br><span class="line">b0:</span><br><span class="line">a1 =...</span><br><span class="line">b1 =...</span><br><span class="line">jmp b1</span><br><span class="line"></span><br><span class="line">b1:          </span><br><span class="line">a2 = phi(a1, b2)      </span><br><span class="line">b2 = phi(b1, a2)      </span><br><span class="line">    ...                      </span><br><span class="line">jmp p, b1, b2          </span><br><span class="line"></span><br></pre></td></tr></table></figure><p>我们的暴力算法会生成</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span 
class="line">34</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">swap problem:</span><br><span class="line">b0:</span><br><span class="line">a1 = ...</span><br><span class="line">b1 = ...</span><br><span class="line">a2 = copy a1</span><br><span class="line">b2 = copy b1</span><br><span class="line">jump b1</span><br><span class="line">b1:</span><br><span class="line">a2 = copy b2 &lt;--- no swap: a2 is overwritten before the next copy reads it</span><br><span class="line">b2 = copy a2</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> p jump b1</span><br><span class="line">jump b2</span><br><span class="line">b2:</span><br><span class="line">...</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">--------------------------------------------</span><br><span class="line"></span><br><span class="line">lost-copy-problem</span><br><span class="line"></span><br><span class="line">b0:</span><br><span class="line">x1 = ...</span><br><span class="line">x2 = copy x1</span><br><span class="line">jump b1</span><br><span class="line">b1:</span><br><span class="line">x3 = x2 + <span class="number">1</span></span><br><span class="line">x2 = copy x3</span><br><span class="line"><span class="keyword">if</span> p jump b1</span><br><span class="line">jump b2</span><br><span class="line">b2:</span><br><span class="line">... 
= x2  &lt;---- wrong: x2 has already been overwritten with x3</span><br><span class="line"></span><br></pre></td></tr></table></figure><h3 id="split-critical-edges"><a href="#split-critical-edges" class="headerlink" title="split critical edges"></a>split critical edges</h3><p>The red edge in the graph below is a critical edge (its source block has several successors and its target block has several predecessors): </p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">digraph G &#123;</span><br><span class="line">    // invisible nodes (no label shown)</span><br><span class="line">    xxx [label=&quot;&quot;, shape=none, width=0, height=0];</span><br><span class="line">    others [label=&quot;&quot;, shape=none, width=0, height=0];</span><br><span class="line">    </span><br><span class="line">    // visible nodes</span><br><span class="line">    b1 [label=&quot;b1&quot;, shape=circle, style=filled, fillcolor=lightgray];</span><br><span class="line">    b2 [label=&quot;b2&quot;, shape=circle, style=filled, fillcolor=lightgray];</span><br><span class="line">    </span><br><span class="line">    // edges</span><br><span class="line">    xxx -&gt; b2 [];  </span><br><span class="line">    b1 -&gt; b2 [color=red, penwidth=2];</span><br><span class="line">    b1 -&gt; others [];  </span><br><span class="line">    </span><br><span class="line">    // layout</span><br><span class="line">    rankdir=TB;  // top-to-bottom layout</span><br><span class="line">&#125;</span><br><span 
class="line"></span><br></pre></td></tr></table></figure><p>Applying critical-edge splitting to the lost copy and swap examples:</p><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">lost-copy-problem</span><br><span class="line"></span><br><span class="line">b0:</span><br><span class="line">x1 =...</span><br><span class="line">x2 = copy x1</span><br><span class="line">jmp b1</span><br><span class="line"></span><br><span class="line">b1:</span><br><span class="line">x3 = x2 + 1</span><br><span class="line">jmp p, b3, b2</span><br><span class="line"></span><br><span 
class="line">b3:</span><br><span class="line">x2 = copy x3</span><br><span class="line">jmp b1</span><br><span class="line"></span><br><span class="line">b2:</span><br><span class="line">... = x2</span><br><span class="line"></span><br><span class="line">---------------------------------------------------</span><br><span class="line"></span><br><span class="line">swap problem</span><br><span class="line"></span><br><span class="line">bb0:</span><br><span class="line">a1 =...</span><br><span class="line">b1 =...</span><br><span class="line">a2 = copy a1</span><br><span class="line">b2 = copy b1</span><br><span class="line">jmp bb1</span><br><span class="line"></span><br><span class="line">bb1:</span><br><span class="line"></span><br><span class="line">jmp p, bb3, bb2</span><br><span class="line"></span><br><span class="line">bb3:</span><br><span class="line">a2 = copy b2  &lt;--- no swap: a2 is overwritten before the next copy reads it</span><br><span class="line">b2 = copy a2</span><br><span class="line">jmp bb1</span><br><span class="line"></span><br><span class="line">bb2:</span><br><span class="line">... = a2</span><br><span class="line">... 
= b2</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>The result for the swap problem still looks wrong.<br>The copy insertion seems to be at fault. Looking again at <code>a2=phi(a1, b2);b2=phi(b1,a2)</code>, a2 and b2 depend on each other.<br>The copies form a cycle...</p><p>In the swap problem we observe that:</p><ul><li>the live ranges of a2 and b2 overlap</li><li>the way copies are inserted also needs to be improved</li></ul><h3 id="isolating-phi-parallel-copy"><a href="#isolating-phi-parallel-copy" class="headerlink" title="isolating phi + parallel copy"></a>isolating phi + parallel copy</h3><blockquote><p>First, a few definitions:</p><ul><li>CSSA (Conventional SSA) form is defined as SSA form for which each phi-web is interference-free.</li><li>TSSA (Transformed SSA) form is non-conventional SSA: some phi-web may not be interference-free.</li></ul></blockquote><p>Both the swap problem and the lost-copy problem examples are in TSSA rather than CSSA form.<br>How do we convert TSSA to CSSA? And once in CSSA, do we still need to watch out for swap-like situations when inserting copies?</p><p>The algorithm therefore proceeds in these steps:</p><ol><li>Insert parallel copies for all φ-functions (TSSA &#x3D;&gt; CSSA)</li><li>Eliminate phis in CSSA</li><li>Sequentialize parallel copies, possibly with one more variable and some additional copies</li><li>Apply some optimizations</li></ol><p>First, convert to CSSA form:</p><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span 
class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">lost copy problem</span><br><span class="line"></span><br><span class="line">b0:</span><br><span class="line">x1 =...</span><br><span class="line">x1&#x27; = copy x1</span><br><span class="line">jmp b1</span><br><span class="line"></span><br><span class="line">b1:</span><br><span class="line">x2&#x27; = phi(x1&#x27;, x3&#x27;)</span><br><span class="line">x2 = copy x2&#x27;</span><br><span class="line">x3 = x2 + 1</span><br><span class="line">x3&#x27; = copy x3</span><br><span class="line"></span><br><span class="line">jmp p, b1, b2</span><br><span class="line"></span><br><span class="line">b2:</span><br><span class="line">... 
= x2</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">----------------------------------------</span><br><span class="line"></span><br><span class="line">swap problem:</span><br><span class="line"></span><br><span class="line">B0:</span><br><span class="line">a1 =...</span><br><span class="line">b1 =...</span><br><span class="line">a1&#x27; = copy a1</span><br><span class="line">b1&#x27; = copy b1</span><br><span class="line">jmp B1</span><br><span class="line"></span><br><span class="line">B1:</span><br><span class="line">a2&#x27; = phi(a1&#x27;, b2&#x27;)</span><br><span class="line">b2&#x27; = phi(b1&#x27;, a2&#x27;)</span><br><span class="line">a2 = copy a2&#x27;</span><br><span class="line">b2 = copy b2&#x27;</span><br><span class="line">b2&#x27; = copy b2</span><br><span class="line">a2&#x27; = copy a2</span><br><span class="line">   ...</span><br><span class="line">jmp p, B1, B2</span><br><span class="line"></span><br><span class="line">B2:</span><br><span class="line">... = a2</span><br><span class="line">... 
= b2</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>After eliminating the phi nodes:</p><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">lost copy problem</span><br><span class="line"></span><br><span class="line">b0:</span><br><span class="line">x1 =...</span><br><span class="line">x1&#x27; = copy x1</span><br><span class="line">x2&#x27; = copy x1&#x27;      // x1&#x27; in x2&#x27; = phi(x1&#x27;, x3&#x27;)</span><br><span 
class="line">jmp b1</span><br><span class="line"></span><br><span class="line">b1:</span><br><span class="line">x2 = copy x2&#x27;</span><br><span class="line">x3 = x2 + 1</span><br><span class="line">x3&#x27; = copy x3</span><br><span class="line">x2&#x27; = x3&#x27;            // x3&#x27; in x2&#x27; = phi(x1&#x27;, x3&#x27;)</span><br><span class="line">jmp p, b1, b2</span><br><span class="line"></span><br><span class="line">b2:</span><br><span class="line">... = x2</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">----------------------------------------</span><br><span class="line"></span><br><span class="line">swap problem:</span><br><span class="line"></span><br><span class="line">B0:</span><br><span class="line">a1 =...</span><br><span class="line">b1 =...</span><br><span class="line">a1&#x27; = copy a1</span><br><span class="line">b1&#x27; = copy b1</span><br><span class="line">a2&#x27; = copy a1&#x27;   // a1&#x27; in a2&#x27; = phi(a1&#x27;, b2&#x27;)</span><br><span class="line">b2&#x27; = copy b1&#x27; // b1&#x27; in b2&#x27; = phi(b1&#x27;, a2&#x27;)</span><br><span class="line">jmp B1</span><br><span class="line"></span><br><span class="line">B1:</span><br><span class="line">a2 = copy a2&#x27;</span><br><span class="line">b2 = copy b2&#x27;</span><br><span class="line">b2&#x27; = copy b2</span><br><span class="line">a2&#x27; = copy a2</span><br><span class="line">   a2&#x27; = copy b2&#x27;  // a2&#x27; in a2&#x27; = phi(a1&#x27;, b2&#x27;)</span><br><span class="line">b2&#x27; = copy a2&#x27;  // b2&#x27; in b2&#x27; = phi(b1&#x27;, a2&#x27;)</span><br><span class="line">jmp p, B1, B2</span><br><span class="line"></span><br><span class="line">B2:</span><br><span class="line">... = a2</span><br><span class="line">... 
= b2</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>The copy-ordering issue from the swap example in the previous subsection is still present, so next we introduce the notion of parallel copies.<br>We treat the inserted copy instructions as parallel copies (all sources are read before any destination is written), and then use an algorithm to find a valid sequential order for them:</p><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br></pre></td><td class="code"><pre><span class="line">// Revisiting Out-of-SSA Translation for Correctness, Code Quality, and Efficiency (https://inria.hal.science/inria-00349925v1/document)</span><br><span class="line"></span><br><span class="line">// @args</span><br><span class="line">// Set P of parallel copies of the form a -&gt; b, a != b</span><br><span class="line">// n: 
one extra fresh variable</span><br><span class="line">// @output:  List of copies in sequential order</span><br><span class="line">def parallel_copy_sequentialization(P:set, n: variable) </span><br><span class="line"></span><br><span class="line">    ready = []</span><br><span class="line">    to_do = []</span><br><span class="line">    pred(n) = none // a map</span><br><span class="line"></span><br><span class="line">    for a -&gt; b in P</span><br><span class="line">        loc(b) = none               // init</span><br><span class="line">        pred(a) = none</span><br><span class="line">    end for</span><br><span class="line"></span><br><span class="line">    for a -&gt; b in P</span><br><span class="line">        loc(a) = a                    /* needed and not copied yet */</span><br><span class="line">        pred(b) = a                   /* (unique) predecessor */</span><br><span class="line">        to_do.append(b)               /* copy into b to be done */</span><br><span class="line"></span><br><span class="line">    for a -&gt; b in P</span><br><span class="line">        if loc(b) == none</span><br><span class="line">            ready.append(b)            /* b is not used and can be overwritten */</span><br><span class="line"></span><br><span class="line">    while to_do != []</span><br><span class="line">        while ready != []</span><br><span class="line">            b = ready.pop()</span><br><span class="line">            a = pred(b)</span><br><span class="line">            c = loc(a)</span><br><span class="line">            emit copy c -&gt; b</span><br><span class="line"></span><br><span class="line">            loc(a) = b</span><br><span class="line">            if a == c and pred(a) != none</span><br><span class="line">                ready.append(a)</span><br><span class="line">        b = to_do.pop()</span><br><span class="line">        l = loc(pred(b))</span><br><span class="line">        if b == l</span><br><span class="line">            emit copy b -&gt; n</span><br><span class="line">            loc(b) = n</span><br><span class="line">            ready.append(b)</span><br><span 
class="line"></span><br></pre></td></tr></table></figure><blockquote><p>Unfortunately, this algorithm as written has a small problem.<br>It is essentially a graph traversal algorithm; see <a href="http://web.cs.ucla.edu/~palsberg/paper/cc09.pdf">cc09.pdf</a></p></blockquote><p>Let's write a demo in Python:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span 
class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">Copy</span>:</span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">__init__</span>(<span class="params">self, src, dst</span>):</span><br><span class="line">        <span class="variable language_">self</span>.src = src</span><br><span class="line">        <span class="variable language_">self</span>.dst = dst</span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">__repr__</span>(<span class="params">self</span>):</span><br><span class="line">        <span class="keyword">return</span> <span class="string">&quot;%s &lt;- %s&quot;</span> % (<span class="variable language_">self</span>.dst, <span class="variable language_">self</span>.src)</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">seq_copy</span>(<span class="params">seq</span>):</span><br><span class="line">    ready = []</span><br><span class="line">    todo = []</span><br><span class="line">    pred = &#123;&#125;</span><br><span class="line">    loc = &#123;&#125;</span><br><span class="line">    n = <span class="string">&quot;tmp&quot;</span></span><br><span class="line">  </span><br><span class="line">    pred[n] = <span class="literal">None</span></span><br><span class="line">    <span class="keyword">for</span> i <span class="keyword">in</span> seq:</span><br><span class="line">        a = i.src</span><br><span class="line">        b = i.dst</span><br><span class="line">        loc[b] = <span class="literal">None</span></span><br><span class="line">        pred[a] = <span class="literal">None</span></span><br><span class="line"></span><br><span class="line">    <span 
class="keyword">for</span> i <span class="keyword">in</span> seq:</span><br><span class="line">        a = i.src</span><br><span class="line">        b = i.dst</span><br><span class="line">        loc[a] = a</span><br><span class="line">        pred[b] = a</span><br><span class="line">        todo.append(b)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> i <span class="keyword">in</span> seq:</span><br><span class="line">        a = i.src</span><br><span class="line">        b = i.dst</span><br><span class="line">        <span class="keyword">if</span> loc[b] <span class="keyword">is</span> <span class="literal">None</span>:</span><br><span class="line">            ready.append(b)</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="keyword">while</span> <span class="built_in">len</span>(todo) != <span class="number">0</span>:</span><br><span class="line">        <span class="keyword">while</span> <span class="built_in">len</span>(ready) != <span class="number">0</span>:</span><br><span class="line">            b = ready.pop()</span><br><span class="line">            a = pred[b]</span><br><span class="line">            c = loc[a]</span><br><span class="line">            <span class="built_in">print</span>(<span class="string">&quot;emit copy &#123;&#125; &lt;- &#123;&#125;&quot;</span>.<span class="built_in">format</span>(b, c))</span><br><span class="line">            loc[a] = b</span><br><span class="line">            <span class="keyword">if</span> a ==c <span class="keyword">and</span> pred[a] <span class="keyword">is</span> <span class="keyword">not</span> <span class="literal">None</span>:</span><br><span class="line">                ready.append(a)</span><br><span class="line">        b = todo.pop(<span class="number">0</span>)</span><br><span class="line">        l = loc[pred[b]]</span><br><span class="line"></span><br><span class="line">        <span 
class="keyword">if</span> b == l: <span class="comment"># &lt;&lt;&lt;&lt;&lt; here</span></span><br><span class="line">            <span class="built_in">print</span>(<span class="string">&quot;emit copy &#123;&#125;&lt;- &#123;&#125;&quot;</span>.<span class="built_in">format</span>(n, b))</span><br><span class="line">            loc[b] = n</span><br><span class="line">            ready.append(b)</span><br><span class="line">           </span><br><span class="line">seq= [</span><br><span class="line">    Copy(<span class="string">&quot;b&quot;</span>, <span class="string">&quot;a&quot;</span>),</span><br><span class="line">    Copy(<span class="string">&quot;a&quot;</span>, <span class="string">&quot;b&quot;</span>),</span><br><span class="line">]</span><br><span class="line"></span><br><span class="line">seq_copy(seq)</span><br></pre></td></tr></table></figure><blockquote><p>Running the script prints nothing: for a pure cycle such as this swap, <code>loc[pred[b]]</code> never equals <code>b</code>, so the cycle-breaking branch is never taken.<br>But if we change <code>b==l</code> to <code>b!=l</code>, it prints:<br>emit copy tmp&lt;- a<br>emit copy a &lt;- b<br>emit copy b &lt;- tmp</p></blockquote><p>At this point we are halfway through phi elimination: the correctness problem is solved, but the code-quality problem is not.</p><p>For the lost copy problem, inserting copy instructions at the start of the basic block is required. (Splitting critical edges instead would degrade the code quality of loops.)<br>For the swap problem, cycles must be detected.</p><p>As for where to insert the copies, the lost copy and swap problems even require different insertion orders.<br><del>Could the people writing these papers be a bit more reliable?</del></p><h3 id="llvm采用的算法"><a href="#llvm采用的算法" class="headerlink" title="llvm采用的算法"></a>The algorithm LLVM uses</h3><p>LLVM has both an implementation that splits critical edges and one that does not depend on splitting them.<br>The parallel-copy approach still has some trouble with code like the following:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// formatting an integer into a buffer</span></span><br><span class="line"><span class="type">int</span> 
value = ....;</span><br><span class="line"><span class="keyword">do</span>&#123;</span><br><span class="line"><span class="type">int</span> a = value % base;</span><br><span class="line">value = value / base;</span><br><span class="line"></span><br><span class="line">buf[len++] = a;</span><br><span class="line"></span><br><span class="line">&#125; <span class="keyword">while</span>(value);</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>The cause is again the copy insertion order (the order in which the parallel copies are traversed).<br>As for the fix, well...<br>Follow the approach LLVM takes around split critical edges.<br>If a phi's incoming block is the source block of a critical edge, for example <code>bb0: a0 = phi a1,bb1, a2, bb2</code> where the edge out of bb2 is a critical edge, then create <code>tmp = copy a1</code> at the end of bb1, <code>a0 = copy tmp</code> at the head of bb0, and <code>tmp = copy a2</code> at the end of bb2.<br>If there is no critical edge, copy directly: create <code>a0 = copy a1</code> at the end of bb1 and <code>a0 = copy a2</code> at the end of bb2.</p><p>A very simple implementation, but still quite effective...</p><p><a href="https://gcc.godbolt.org/z/aMWPjK3a9">https://gcc.godbolt.org/z/aMWPjK3a9</a></p><p>Why not split the block? Because too many jumps hurt performance, especially inside loops.</p><hr><h2 id="优化问题"><a href="#优化问题" class="headerlink" title="优化问题"></a>Optimization</h2><p>Keyword: copy de-coalescing</p><p>That is, how to minimize the number of inserted copy instructions.</p><ol><li>Optimize the parallel copies directly on CSSA</li><li>Generate many redundant copy instructions first, then optimize them away</li></ol><p>The first approach is somewhat difficult; in my view the hard part is how to compute the live ranges involved in the parallel copies.<br>The second approach is the traditional one: interference graph + union-find, merging variables that do not interfere into the same set.</p><hr><h2 id="others"><a href="#others" class="headerlink" title="others"></a>others</h2><p>Let's take another look at parallel copies.</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span 
class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line">Given</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">bb1:</span><br><span class="line">a1 = ...</span><br><span class="line">b1 = ...</span><br><span class="line">c1 = ...</span><br><span class="line">jump bb</span><br><span class="line"></span><br><span class="line">bb2:</span><br><span class="line">a2 = ...</span><br><span class="line">b2 = ...</span><br><span class="line">c2 = ...</span><br><span class="line">jump bb</span><br><span class="line"></span><br><span class="line">bb3:</span><br><span class="line">a3 = ...</span><br><span class="line">b3 = ...</span><br><span class="line">c3 = ...</span><br><span class="line">jump bb</span><br><span class="line"></span><br><span class="line">bb:</span><br><span class="line">a = phi(a1, a2, a3)</span><br><span class="line">b = phi(b1, b2, b3)</span><br><span class="line">c = phi(c1, c2, c3)</span><br><span class="line"></span><br></pre></td></tr></table></figure><hr><blockquote><p>Note: <code>\\</code> may be interpreted as <code>\</code>, so <code>\newline</code> is the better choice.</p></blockquote><p>$\begin{bmatrix} a \newline b \newline c \end{bmatrix} &#x3D; \Phi \begin{bmatrix} a1 &amp; a2 &amp; a3 \newline b1 &amp; b2 &amp; b3 \newline c1 &amp; c2 &amp; c3 \end{bmatrix}$</p><p>Then for bb1 we have the parallel copies $a \gets a1, b \gets b1, c \gets c1$.<br>The corresponding <strong>Location Transfer Graph</strong> is $G &#x3D; (V, E), V &#x3D; \lbrace a, b, c, a1, b1, c1 \rbrace, E &#x3D; \lbrace a \gets a1, b \gets b1, c \gets c1 \rbrace$</p><p>Now look again at the phi nodes from the swap problem:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">a2 = phi(a1, b2)</span><br><span class="line">b2 = phi(b1, a2)</span><br></pre></td></tr></table></figure><ol><li>Parallel copies for predecessor 1</li></ol><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">flowchart TD</span><br><span class="line">  a1 --&gt; a2</span><br><span class="line">  b1 --&gt; b2</span><br></pre></td></tr></table></figure><ol start="2"><li>Parallel copies for predecessor 2</li></ol><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">flowchart TD</span><br><span class="line">  a2 --&gt; b2</span><br><span class="line">  b2 --&gt; a2</span><br></pre></td></tr></table></figure><hr><p>References</p><ul><li><a href="https://www.cs.utexas.edu/~pingali/CS380C/2010/papers/ssaCytron.pdf">https://www.cs.utexas.edu/~pingali/CS380C/2010/papers/ssaCytron.pdf</a></li><li><a href="https://ics.uci.edu/~yeouln/course/ssa.pdf">https://ics.uci.edu/~yeouln/course/ssa.pdf</a></li><li><a href="https://inria.hal.science/inria-00349925v1/document">Revisiting Out-of-SSA Translation for Correctness, Code Quality, and Efficiency</a></li><li><a href="http://web.cs.ucla.edu/~palsberg/paper/cc09.pdf">cc09.pdf</a></li></ul><hr><p>The LLVM implementation:</p><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span 
class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line">int cond_random();</span><br><span class="line"></span><br><span class="line">void swap_p(int a, int b, int *arr)&#123;</span><br><span class="line"></span><br><span class="line">    while(cond_random() )&#123;</span><br><span class="line">        int tmp = a;</span><br><span class="line">        a = b;</span><br><span class="line">        b = tmp;</span><br><span class="line">    &#125;</span><br><span class="line">    *arr++ = a;</span><br><span class="line">    *arr++ = b;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">void lost_copy(int a, int *p)&#123;</span><br><span class="line"></span><br><span class="line">    while(cond_random())&#123;</span><br><span class="line">        a+=1;</span><br><span class="line">    &#125;</span><br><span class="line">    *p = a;</span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><h3 id="lost-copy"><a href="#lost-copy" class="headerlink" title="lost copy"></a>lost copy</h3><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span 
class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span 
class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"># Machine code for function lost_copy(int, int*): IsSSA, TracksLiveness</span><br><span class="line">Function Live Ins: $edi in %3, $rsi in %4</span><br><span class="line"></span><br><span class="line">bb.0.entry:</span><br><span class="line">  successors: %bb.1(0x80000000); %bb.1(100.00%)</span><br><span class="line">  liveins: $edi, $rsi</span><br><span class="line">  %4:gr64 = COPY killed $rsi</span><br><span class="line">  %3:gr32 = COPY killed $edi</span><br><span class="line">  %0:gr32 = DEC32r killed %3:gr32(tied-def 0), implicit-def dead $eflags; example.cpp:19:5</span><br><span class="line"></span><br><span class="line">bb.1.while.cond:</span><br><span class="line">; predecessors: %bb.0, %bb.1</span><br><span class="line">  successors: %bb.2(0x04000000), %bb.1(0x7c000000); %bb.2(3.12%), %bb.1(96.88%)</span><br><span class="line"></span><br><span class="line">  %1:gr32 = PHI %0:gr32, %bb.0, %2:gr32, %bb.1, debug-instr-number 1</span><br><span class="line">  ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp; example.cpp:19:11</span><br><span 
class="line">  CALL64pcrel32 target-flags(x86-plt) @cond_random(), &lt;regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...&gt;, implicit $rsp, implicit $ssp, implicit-def $rsp, implicit-def $ssp, implicit-def $eax; example.cpp:19:11</span><br><span class="line">  ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp; example.cpp:19:11</span><br><span class="line">  %5:gr32 = COPY killed $eax; example.cpp:19:11</span><br><span class="line">  %2:gr32 = INC32r killed %1:gr32(tied-def 0), implicit-def dead $eflags; example.cpp:19:5</span><br><span class="line">  TEST32rr killed %5:gr32, %5:gr32, implicit-def $eflags; example.cpp:19:11</span><br><span class="line">  JCC_1 %bb.1, 5, implicit killed $eflags; example.cpp:19:5</span><br><span class="line">  JMP_1 %bb.2; example.cpp:19:5</span><br><span class="line"></span><br><span class="line">bb.2.while.end:</span><br><span class="line">; predecessors: %bb.1</span><br><span class="line"></span><br><span class="line">  MOV32mr killed %4:gr64, 1, $noreg, 0, $noreg, killed %2:gr32 :: (store (s32) into %ir.p); example.cpp:22:8</span><br><span class="line">  RET 0; example.cpp:23:1</span><br><span class="line"></span><br><span class="line"># End machine code for function lost_copy(int, int*).</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"># Machine code for function lost_copy(int, int*): NoPHIs, TracksLiveness</span><br><span class="line">Function Live Ins: $edi in %3, $rsi in %4</span><br><span class="line"></span><br><span class="line">bb.0.entry:</span><br><span class="line">  successors: %bb.1(0x80000000); %bb.1(100.00%)</span><br><span class="line">  liveins: $edi, $rsi</span><br><span class="line">  %4:gr64 = COPY killed $rsi</span><br><span class="line">  
%3:gr32 = COPY killed $edi</span><br><span class="line">  %0:gr32 = DEC32r killed %3:gr32(tied-def 0), implicit-def dead $eflags; example.cpp:19:5</span><br><span class="line">  %6:gr32 = COPY killed %0:gr32</span><br><span class="line"></span><br><span class="line">bb.1.while.cond:</span><br><span class="line">; predecessors: %bb.0, %bb.1</span><br><span class="line">  successors: %bb.2(0x04000000), %bb.1(0x7c000000); %bb.2(3.12%), %bb.1(96.88%)</span><br><span class="line"></span><br><span class="line">  %1:gr32 = COPY killed %6:gr32</span><br><span class="line">  ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp; example.cpp:19:11</span><br><span class="line">  CALL64pcrel32 target-flags(x86-plt) @cond_random(), &lt;regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...&gt;, implicit $rsp, implicit $ssp, implicit-def $rsp, implicit-def $ssp, implicit-def $eax; example.cpp:19:11</span><br><span class="line">  ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp; example.cpp:19:11</span><br><span class="line">  %5:gr32 = COPY killed $eax; example.cpp:19:11</span><br><span class="line">  %2:gr32 = INC32r killed %1:gr32(tied-def 0), implicit-def dead $eflags; example.cpp:19:5</span><br><span class="line">  TEST32rr killed %5:gr32, %5:gr32, implicit-def $eflags; example.cpp:19:11</span><br><span class="line">  %6:gr32 = COPY %2:gr32</span><br><span class="line">  JCC_1 %bb.1, 5, implicit killed $eflags; example.cpp:19:5</span><br><span class="line">  JMP_1 %bb.2; example.cpp:19:5</span><br><span class="line"></span><br><span class="line">bb.2.while.end:</span><br><span class="line">; predecessors: %bb.1</span><br><span class="line"></span><br><span class="line">  
MOV32mr killed %4:gr64, 1, $noreg, 0, $noreg, killed %2:gr32 :: (store (s32) into %ir.p); example.cpp:22:8</span><br><span class="line">  RET 0; example.cpp:23:1</span><br><span class="line"></span><br><span class="line"># End machine code for function lost_copy(int, int*).</span><br><span class="line"></span><br><span class="line"># Machine code for function lost_copy(int, int*): NoPHIs, TracksLiveness, TiedOpsRewritten</span><br><span class="line">Function Live Ins: $edi in %3, $rsi in %4</span><br><span class="line"></span><br><span class="line">0Bbb.0.entry:</span><br><span class="line">  successors: %bb.1(0x80000000); %bb.1(100.00%)</span><br><span class="line">  liveins: $edi, $rsi</span><br><span class="line">16B  %4:gr64 = COPY $rsi</span><br><span class="line">32B  %6:gr32 = COPY $edi</span><br><span class="line">64B  %6:gr32 = DEC32r %6:gr32(tied-def 0), implicit-def dead $eflags; example.cpp:19:5</span><br><span class="line"></span><br><span class="line">96Bbb.1.while.cond:</span><br><span class="line">; predecessors: %bb.0, %bb.1</span><br><span class="line">  successors: %bb.2(0x04000000), %bb.1(0x7c000000); %bb.2(3.12%), %bb.1(96.88%)</span><br><span class="line"></span><br><span class="line">128B  ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp; example.cpp:19:11</span><br><span class="line">144B  CALL64pcrel32 target-flags(x86-plt) @cond_random(), &lt;regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...&gt;, implicit $rsp, implicit $ssp, implicit-def $rsp, implicit-def $ssp, implicit-def $eax; example.cpp:19:11</span><br><span class="line">160B  ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp; example.cpp:19:11</span><br><span 
class="line">176B  %5:gr32 = COPY killed $eax; example.cpp:19:11</span><br><span class="line">208B  %6:gr32 = INC32r %6:gr32(tied-def 0), implicit-def dead $eflags; example.cpp:19:5</span><br><span class="line">224B  TEST32rr %5:gr32, %5:gr32, implicit-def $eflags; example.cpp:19:11</span><br><span class="line">256B  JCC_1 %bb.1, 5, implicit killed $eflags; example.cpp:19:5</span><br><span class="line">272B  JMP_1 %bb.2; example.cpp:19:5</span><br><span class="line"></span><br><span class="line">288Bbb.2.while.end:</span><br><span class="line">; predecessors: %bb.1</span><br><span class="line"></span><br><span class="line">304B  MOV32mr %4:gr64, 1, $noreg, 0, $noreg, %6:gr32 :: (store (s32) into %ir.p); example.cpp:22:8</span><br><span class="line">320B  RET 0; example.cpp:23:1</span><br><span class="line"></span><br><span class="line"># End machine code for function lost_copy(int, int*).</span><br><span class="line"></span><br></pre></td></tr></table></figure><h3 id="swap"><a href="#swap" class="headerlink" title="swap"></a>swap</h3><figure class="highlight txt"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span 
class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span 
class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br></pre></td><td class="code"><pre><span class="line"></span><br><span class="line"># Machine code for function swap_p(int, int, int*): IsSSA, TracksLiveness</span><br><span class="line">Function Live Ins: $edi in %2, $esi in %3, $rdx in %4</span><br><span class="line"></span><br><span class="line">bb.0.entry:</span><br><span class="line">  successors: %bb.1(0x80000000); %bb.1(100.00%)</span><br><span class="line">  liveins: $edi, $esi, $rdx</span><br><span class="line">  %4:gr64 = COPY killed $rdx</span><br><span class="line">  %3:gr32 = COPY killed $esi</span><br><span class="line">  %2:gr32 = COPY killed $edi</span><br><span class="line"></span><br><span class="line">bb.1.while.cond:</span><br><span class="line">; predecessors: %bb.0, %bb.1</span><br><span class="line">  successors: %bb.2(0x04000000), %bb.1(0x7c000000); %bb.2(3.12%), %bb.1(96.88%)</span><br><span class="line"></span><br><span class="line">  %0:gr32 = PHI %3:gr32, %bb.0, %1:gr32, %bb.1, debug-instr-number 2</span><br><span class="line">  %1:gr32 = PHI %2:gr32, %bb.0, %0:gr32, %bb.1, debug-instr-number 1</span><br><span class="line">  ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp; example.cpp:7:11</span><br><span class="line">  CALL64pcrel32 target-flags(x86-plt) @cond_random(), &lt;regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx 
$r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...&gt;, implicit $rsp, implicit $ssp, implicit-def $rsp, implicit-def $ssp, implicit-def $eax; example.cpp:7:11</span><br><span class="line">  ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp; example.cpp:7:11</span><br><span class="line">  %5:gr32 = COPY killed $eax; example.cpp:7:11</span><br><span class="line">  TEST32rr killed %5:gr32, %5:gr32, implicit-def $eflags; example.cpp:7:11</span><br><span class="line">  JCC_1 %bb.1, 5, implicit killed $eflags; example.cpp:7:5</span><br><span class="line">  JMP_1 %bb.2; example.cpp:7:5</span><br><span class="line"></span><br><span class="line">bb.2.while.end:</span><br><span class="line">; predecessors: %bb.1</span><br><span class="line"></span><br><span class="line">  MOV32mr %4:gr64, 1, $noreg, 0, $noreg, killed %1:gr32 :: (store (s32) into %ir.arr); example.cpp:12:12</span><br><span class="line">  MOV32mr killed %4:gr64, 1, $noreg, 4, $noreg, killed %0:gr32 :: (store (s32) into %ir.incdec.ptr); example.cpp:13:12</span><br><span class="line">  RET 0; example.cpp:14:1</span><br><span class="line"></span><br><span class="line"># End machine code for function swap_p(int, int, int*).</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"># Machine code for function swap_p(int, int, int*): NoPHIs, TracksLiveness</span><br><span class="line">Function Live Ins: $edi in %2, $esi in %3, $rdx in %4</span><br><span class="line"></span><br><span class="line">bb.0.entry:</span><br><span class="line">  successors: %bb.1(0x80000000); %bb.1(100.00%)</span><br><span class="line">  liveins: $edi, $esi, $rdx</span><br><span class="line">  %4:gr64 = COPY killed $rdx</span><br><span class="line">  %3:gr32 = COPY killed $esi</span><br><span class="line">  %2:gr32 = COPY killed 
$edi</span><br><span class="line">  %6:gr32 = COPY killed %3:gr32</span><br><span class="line">  %7:gr32 = COPY killed %2:gr32</span><br><span class="line"></span><br><span class="line">bb.1.while.cond:</span><br><span class="line">; predecessors: %bb.0, %bb.1</span><br><span class="line">  successors: %bb.2(0x04000000), %bb.1(0x7c000000); %bb.2(3.12%), %bb.1(96.88%)</span><br><span class="line"></span><br><span class="line">  %1:gr32 = COPY killed %7:gr32</span><br><span class="line">  %0:gr32 = COPY killed %6:gr32</span><br><span class="line">  ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp; example.cpp:7:11</span><br><span class="line">  CALL64pcrel32 target-flags(x86-plt) @cond_random(), &lt;regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...&gt;, implicit $rsp, implicit $ssp, implicit-def $rsp, implicit-def $ssp, implicit-def $eax; example.cpp:7:11</span><br><span class="line">  ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp; example.cpp:7:11</span><br><span class="line">  %5:gr32 = COPY killed $eax; example.cpp:7:11</span><br><span class="line">  TEST32rr killed %5:gr32, %5:gr32, implicit-def $eflags; example.cpp:7:11</span><br><span class="line">  %6:gr32 = COPY %1:gr32</span><br><span class="line">  %7:gr32 = COPY %0:gr32</span><br><span class="line">  JCC_1 %bb.1, 5, implicit killed $eflags; example.cpp:7:5</span><br><span class="line">  JMP_1 %bb.2; example.cpp:7:5</span><br><span class="line"></span><br><span class="line">bb.2.while.end:</span><br><span class="line">; predecessors: %bb.1</span><br><span class="line"></span><br><span class="line">  MOV32mr %4:gr64, 1, $noreg, 0, $noreg, killed %1:gr32 :: (store (s32) into %ir.arr); 
example.cpp:12:12</span><br><span class="line">  MOV32mr killed %4:gr64, 1, $noreg, 4, $noreg, killed %0:gr32 :: (store (s32) into %ir.incdec.ptr); example.cpp:13:12</span><br><span class="line">  RET 0; example.cpp:14:1</span><br><span class="line"></span><br><span class="line"># End machine code for function swap_p(int, int, int*).</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"># Machine code for function swap_p(int, int, int*): NoPHIs, TracksLiveness, TiedOpsRewritten</span><br><span class="line">Function Live Ins: $edi in %2, $esi in %3, $rdx in %4</span><br><span class="line"></span><br><span class="line">0Bbb.0.entry:</span><br><span class="line">  successors: %bb.1(0x80000000); %bb.1(100.00%)</span><br><span class="line">  liveins: $edi, $esi, $rdx</span><br><span class="line">16B  %4:gr64 = COPY $rdx</span><br><span class="line">32B  %6:gr32 = COPY $esi</span><br><span class="line">48B  %7:gr32 = COPY $edi</span><br><span class="line"></span><br><span class="line">96Bbb.1.while.cond:</span><br><span class="line">; predecessors: %bb.0, %bb.1</span><br><span class="line">  successors: %bb.2(0x04000000), %bb.1(0x7c000000); %bb.2(3.12%), %bb.1(96.88%)</span><br><span class="line"></span><br><span class="line">112B  %1:gr32 = COPY %7:gr32</span><br><span class="line">128B  %7:gr32 = COPY %6:gr32</span><br><span class="line">144B  ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp; example.cpp:7:11</span><br><span class="line">160B  CALL64pcrel32 target-flags(x86-plt) @cond_random(), &lt;regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...&gt;, implicit $rsp, implicit $ssp, implicit-def $rsp, implicit-def $ssp, implicit-def $eax; example.cpp:7:11</span><br><span class="line">176B  
ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp; example.cpp:7:11</span><br><span class="line">192B  %5:gr32 = COPY killed $eax; example.cpp:7:11</span><br><span class="line">208B  TEST32rr %5:gr32, %5:gr32, implicit-def $eflags; example.cpp:7:11</span><br><span class="line">224B  %6:gr32 = COPY %1:gr32</span><br><span class="line">256B  JCC_1 %bb.1, 5, implicit killed $eflags; example.cpp:7:5</span><br><span class="line">272B  JMP_1 %bb.2; example.cpp:7:5</span><br><span class="line"></span><br><span class="line">288Bbb.2.while.end:</span><br><span class="line">; predecessors: %bb.1</span><br><span class="line"></span><br><span class="line">304B  MOV32mr %4:gr64, 1, $noreg, 0, $noreg, %1:gr32 :: (store (s32) into %ir.arr); example.cpp:12:12</span><br><span class="line">320B  MOV32mr %4:gr64, 1, $noreg, 4, $noreg, %7:gr32 :: (store (s32) into %ir.incdec.ptr); example.cpp:13:12</span><br><span class="line">336B  RET 0; example.cpp:14:1</span><br><span class="line"></span><br><span class="line"># End machine code for function swap_p(int, int, int*).</span><br><span class="line"></span><br></pre></td></tr></table></figure>]]>
    </content>
    <id>http://example.com/2025/09/20/SSA/</id>
    <link href="http://example.com/2025/09/20/SSA/"/>
    <published>2025-09-20T00:00:00.000Z</published>
    <summary>
      <![CDATA[<blockquote>
<p>Creating and eliminating phi nodes</p>
</blockquote>
<h1 id="1-SSA构建算法"><a href="#1-SSA构建算法" class="headerlink" title="1. SSA构建算法"></a>1. SSA construction algorithm]]>
    </summary>
    <title>SSA Construction and Destruction</title>
    <updated>2026-04-28T13:16:41.644Z</updated>
  </entry>
</feed>
