| <!DOCTYPE html><html lang="en"><head><meta charset="utf-8"><meta name="viewport" content="width=device-width, initial-scale=1.0"><meta name="generator" content="rustdoc"><meta name="description" content="API documentation for the Rust `core_arch` crate."><meta name="keywords" content="rust, rustlang, rust-lang, core_arch"><title>core_arch - Rust</title><link rel="stylesheet" type="text/css" href="../normalize.css"><link rel="stylesheet" type="text/css" href="../rustdoc.css" id="mainThemeStyle"><link rel="stylesheet" type="text/css" href="../dark.css"><link rel="stylesheet" type="text/css" href="../light.css" id="themeStyle"><script src="../storage.js"></script><noscript><link rel="stylesheet" href="../noscript.css"></noscript><link rel="shortcut icon" href="../favicon.ico"><style type="text/css">#crate-search{background-image:url("../down-arrow.svg");}</style></head><body class="rustdoc mod"><!--[if lte IE 8]><div class="warning">This old browser is unsupported and will most likely display funky things.</div><![endif]--><nav class="sidebar"><div class="sidebar-menu">☰</div><a href='../core_arch/index.html'><div class='logo-container'><img src='../rust-logo.png' alt='logo'></div></a><p class='location'>Crate core_arch</p><div class="sidebar-elems"><a id='all-types' href='all.html'><p>See all core_arch's items</p></a><div class="block items"><ul><li><a href="#modules">Modules</a></li></ul></div><p class='location'></p><script>window.sidebarCurrent = {name: 'core_arch', ty: 'mod', relpath: '../'};</script></div></nav><div class="theme-picker"><button id="theme-picker" aria-label="Pick another theme!"><img src="../brush.svg" width="18" alt="Pick another theme!"></button><div id="theme-choices"></div></div><script src="../theme.js"></script><nav class="sub"><form class="search-form js-only"><div class="search-container"><div><select id="crate-search"><option value="All crates">All crates</option></select><input class="search-input" name="search" autocomplete="off" 
spellcheck="false" placeholder="Click or press ‘S’ to search, ‘?’ for more options…" type="search"></div><a id="settings-menu" href="../settings.html"><img src="../wheel.svg" width="18" alt="Change settings"></a></div></form></nav><section id="main" class="content"><h1 class='fqn'><span class='out-of-band'><span class='since' title='Stable since Rust version '></span><span id='render-detail'><a id="toggle-all-docs" href="javascript:void(0)" title="collapse all docs">[<span class='inner'>−</span>]</a></span><a class='srclink' href='../src/core_arch/lib.rs.html#1-78' title='goto source code'>[src]</a></span><span class='in-band'>Crate <a class="mod" href=''>core_arch</a></span></h1><div class='stability'><div class='stab unstable'><span class='emoji'>🔬</span> This is a nightly-only experimental API. (<code>stdsimd</code>)</div></div><div class='docblock'><p>SIMD and vendor intrinsics module.</p> |
| <p>This module is intended to be the gateway to architecture-specific |
| intrinsic functions, typically related to SIMD (but not always!). Each |
| architecture that Rust compiles to may contain a submodule here, which |
| means that this is not a portable module! If you're writing a portable |
| library take care when using these APIs!</p> |
| <p>Under this module you'll find an architecture-named module, such as |
| <code>x86_64</code>. Each <code>target_arch</code> value that Rust can compile for may have a |
| module entry here, present only when compiling for that particular target. For example, the |
| <code>i686-pc-windows-msvc</code> target will have an <code>x86</code> module here, whereas |
| <code>x86_64-pc-windows-msvc</code> has <code>x86_64</code>.</p> |
| <h1 id="overview" class="section-header"><a href="#overview">Overview</a></h1> |
| <p>This module exposes vendor-specific intrinsics that typically correspond to |
| a single machine instruction. These intrinsics are not portable: their |
| availability is architecture-dependent, and not every CPU of a given |
| architecture necessarily supports every intrinsic.</p> |
| <p>The <code>arch</code> module is intended to be a low-level implementation detail for |
| higher-level APIs. Using it correctly can be quite tricky as you need to |
| ensure at least a few guarantees are upheld:</p> |
| <ul> |
| <li>The correct architecture's module is used. For example the <code>arm</code> module |
| isn't available on the <code>x86_64-unknown-linux-gnu</code> target. This is |
| typically done by ensuring that <code>#[cfg]</code> is used appropriately when using |
| this module.</li> |
| <li>The CPU the program is currently running on supports the function being |
| called. For example it is unsafe to call an AVX2 function on a CPU that |
| doesn't actually support AVX2.</li> |
| </ul> |
| <p>As a result of the latter of these guarantees, all intrinsics in this module |
| are <code>unsafe</code>, and extra care needs to be taken when calling them!</p> |
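<p>As a minimal sketch of the first guarantee, the snippet below uses <code>#[cfg(target_arch = "...")]</code> so that exactly one definition is compiled for any given target. The constant name <code>ARCH_MODULE</code> is hypothetical and exists only for this illustration, and the sketch only knows about four architectures.</p>

```rust
// Hypothetical constant for this sketch: the name of the `std::arch`
// submodule matching the compilation target, if the sketch knows it.
#[cfg(target_arch = "x86")]
pub const ARCH_MODULE: Option<&str> = Some("x86");
#[cfg(target_arch = "x86_64")]
pub const ARCH_MODULE: Option<&str> = Some("x86_64");
#[cfg(target_arch = "arm")]
pub const ARCH_MODULE: Option<&str> = Some("arm");
#[cfg(target_arch = "aarch64")]
pub const ARCH_MODULE: Option<&str> = Some("aarch64");

// Every other target falls through to `None` instead of failing to compile.
#[cfg(not(any(
    target_arch = "x86",
    target_arch = "x86_64",
    target_arch = "arm",
    target_arch = "aarch64"
)))]
pub const ARCH_MODULE: Option<&str> = None;
```

<p>The same attribute placed on a <code>use</code> item or a function keeps code that names the wrong architecture's module from being compiled at all.</p>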
| <h1 id="cpu-feature-detection" class="section-header"><a href="#cpu-feature-detection">CPU Feature Detection</a></h1> |
| <p>In order to call these APIs in a safe fashion, there are a number of |
| mechanisms available to ensure that the correct CPU feature is available |
| to call an intrinsic. Let's consider, for example, the <code>_mm256_add_epi64</code> |
| intrinsic on the <code>x86</code> and <code>x86_64</code> architectures. This function requires |
| the AVX2 feature as <a href="https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm256_add_epi64&expand=100">documented by Intel</a>, so to correctly call |
| this function we need to (a) guarantee we only call it on <code>x86</code>/<code>x86_64</code> |
| and (b) ensure that the CPU feature is available.</p> |
| <h2 id="static-cpu-feature-detection" class="section-header"><a href="#static-cpu-feature-detection">Static CPU Feature Detection</a></h2> |
| <p>The first option available to us is to conditionally compile code via the |
| <code>#[cfg]</code> attribute. CPU features correspond to the <code>target_feature</code> cfg |
| available, and can be used like so:</p> |
| |
| <div class='information'><div class='tooltip ignore'>ⓘ<span class='tooltiptext'>This example is not tested</span></div></div><div class="example-wrap"><pre class="rust rust-example-rendered ignore"> |
| <span class="attribute">#[<span class="ident">cfg</span>( |
| <span class="ident">all</span>( |
| <span class="ident">any</span>(<span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86"</span>, <span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86_64"</span>), |
| <span class="ident">target_feature</span> <span class="op">=</span> <span class="string">"avx2"</span> |
| ) |
| )]</span> |
| <span class="kw">fn</span> <span class="ident">foo</span>() { |
| <span class="attribute">#[<span class="ident">cfg</span>(<span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86"</span>)]</span> |
| <span class="kw">use</span> <span class="ident">std</span>::<span class="ident">arch</span>::<span class="ident">x86</span>::<span class="ident">_mm256_add_epi64</span>; |
| <span class="attribute">#[<span class="ident">cfg</span>(<span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86_64"</span>)]</span> |
| <span class="kw">use</span> <span class="ident">std</span>::<span class="ident">arch</span>::<span class="ident">x86_64</span>::<span class="ident">_mm256_add_epi64</span>; |
| |
| <span class="kw">unsafe</span> { |
| <span class="ident">_mm256_add_epi64</span>(...); |
| } |
| }</pre></div> |
| <p>Here we're using <code>#[cfg(target_feature = "avx2")]</code> to conditionally compile |
| this function into our module. This means that if the <code>avx2</code> feature is |
| <em>enabled statically</em> then we'll use the <code>_mm256_add_epi64</code> function at |
| runtime. The <code>unsafe</code> block here can be justified through the usage of |
| <code>#[cfg]</code> to only compile the code in situations where the safety guarantees |
| are upheld.</p> |
| <p>Statically enabling a feature is typically done with the <code>-C target-feature</code> or <code>-C target-cpu</code> flags to the compiler. For example if |
| your local CPU supports AVX2 then you can compile the above function with:</p> |
| <pre><code class="language-sh">$ RUSTFLAGS='-C target-cpu=native' cargo build |
| </code></pre> |
| <p>Or otherwise you can specifically enable just the AVX2 feature:</p> |
| <pre><code class="language-sh">$ RUSTFLAGS='-C target-feature=+avx2' cargo build |
| </code></pre> |
| <p>Note that when you compile a binary with a particular feature enabled it's |
| important to ensure that you only run the binary on systems which satisfy |
| the required feature set.</p> |
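<p>A fuller sketch of static dispatch might pair the AVX2 version with a portable fallback, so that the same source builds whether or not the feature is enabled. The function name <code>add_lanes_static</code> and the choice of four <code>i64</code> lanes are assumptions made for illustration only.</p>

```rust
/// Compiled only when AVX2 is statically enabled on x86/x86_64.
#[cfg(all(
    any(target_arch = "x86", target_arch = "x86_64"),
    target_feature = "avx2"
))]
pub fn add_lanes_static(a: [i64; 4], b: [i64; 4]) -> [i64; 4] {
    #[cfg(target_arch = "x86")]
    use std::arch::x86::*;
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    let mut out = [0_i64; 4];
    // The `#[cfg]` above means AVX2 is enabled for the whole build, and the
    // unaligned load/store intrinsics touch exactly the 32 bytes of each array.
    unsafe {
        let va = _mm256_loadu_si256(a.as_ptr() as *const __m256i);
        let vb = _mm256_loadu_si256(b.as_ptr() as *const __m256i);
        let sum = _mm256_add_epi64(va, vb);
        _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, sum);
    }
    out
}

/// Compiled in every other configuration; wraps on overflow like the intrinsic.
#[cfg(not(all(
    any(target_arch = "x86", target_arch = "x86_64"),
    target_feature = "avx2"
)))]
pub fn add_lanes_static(a: [i64; 4], b: [i64; 4]) -> [i64; 4] {
    [
        a[0].wrapping_add(b[0]),
        a[1].wrapping_add(b[1]),
        a[2].wrapping_add(b[2]),
        a[3].wrapping_add(b[3]),
    ]
}
```

<p>Which of the two definitions ends up in the binary is decided entirely at compile time, so a default build uses the fallback and a build with <code>-C target-feature=+avx2</code> uses the intrinsic.</p>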
| <h2 id="dynamic-cpu-feature-detection" class="section-header"><a href="#dynamic-cpu-feature-detection">Dynamic CPU Feature Detection</a></h2> |
| <p>Sometimes statically dispatching isn't quite what you want. Instead you |
| might want to build a portable binary that runs across a variety of CPUs, |
| but at runtime it selects the most optimized implementation available. This |
| allows you to build a "least common denominator" binary which has certain |
| sections more optimized for different CPUs.</p> |
| <p>Taking our previous example, we're going to compile our binary |
| <em>without</em> AVX2 support, but we'd like to enable it for just one function. |
| We can do that like so:</p> |
| |
| <div class='information'><div class='tooltip ignore'>ⓘ<span class='tooltiptext'>This example is not tested</span></div></div><div class="example-wrap"><pre class="rust rust-example-rendered ignore"> |
| <span class="kw">fn</span> <span class="ident">foo</span>() { |
| <span class="attribute">#[<span class="ident">cfg</span>(<span class="ident">any</span>(<span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86"</span>, <span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86_64"</span>))]</span> |
| { |
| <span class="kw">if</span> <span class="macro">is_x86_feature_detected</span><span class="macro">!</span>(<span class="string">"avx2"</span>) { |
| <span class="kw">return</span> <span class="kw">unsafe</span> { <span class="ident">foo_avx2</span>() }; |
| } |
| } |
| |
| <span class="comment">// fallback implementation without using AVX2</span> |
| } |
| |
| <span class="attribute">#[<span class="ident">cfg</span>(<span class="ident">any</span>(<span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86"</span>, <span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86_64"</span>))]</span> |
| <span class="attribute">#[<span class="ident">target_feature</span>(<span class="ident">enable</span> <span class="op">=</span> <span class="string">"avx2"</span>)]</span> |
| <span class="kw">unsafe</span> <span class="kw">fn</span> <span class="ident">foo_avx2</span>() { |
| <span class="attribute">#[<span class="ident">cfg</span>(<span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86"</span>)]</span> |
| <span class="kw">use</span> <span class="ident">std</span>::<span class="ident">arch</span>::<span class="ident">x86</span>::<span class="ident">_mm256_add_epi64</span>; |
| <span class="attribute">#[<span class="ident">cfg</span>(<span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86_64"</span>)]</span> |
| <span class="kw">use</span> <span class="ident">std</span>::<span class="ident">arch</span>::<span class="ident">x86_64</span>::<span class="ident">_mm256_add_epi64</span>; |
| |
| <span class="ident">_mm256_add_epi64</span>(...); |
| }</pre></div> |
| <p>There are a couple of components in play here, so let's go through them in |
| detail!</p> |
| <ul> |
| <li> |
| <p>First up we notice the <code>is_x86_feature_detected!</code> macro. Provided by |
| the standard library, this macro will perform necessary runtime detection |
| to determine whether the CPU the program is running on supports the |
| specified feature. In this case the macro will expand to a boolean |
| expression evaluating to whether the local CPU has the AVX2 feature or |
| not.</p> |
| <p>Note that this macro, like the <code>arch</code> module, is platform-specific. For |
| example, calling <code>is_x86_feature_detected!("avx2")</code> on ARM will be a |
| compile-time error. To ensure we don't hit this error, a statement-level |
| <code>#[cfg]</code> is used to only compile usage of the macro on <code>x86</code>/<code>x86_64</code>.</p> |
| </li> |
| <li> |
| <p>Next up we see our AVX2-enabled function, <code>foo_avx2</code>. This function is |
| decorated with the <code>#[target_feature]</code> attribute which enables a CPU |
| feature for just this one function. Using a compiler flag like <code>-C target-feature=+avx2</code> will enable AVX2 for the entire program, but using |
| an attribute will only enable it for the one function. Usage of the |
| <code>#[target_feature]</code> attribute currently requires the function to also be |
| <code>unsafe</code>, as we see here. This is because the function can only be |
| correctly called on systems which have the AVX2 feature (like the intrinsics |
| themselves).</p> |
| </li> |
| </ul> |
| <p>And with all that we should have a working program! This program will run |
| across all machines and it'll use the optimized AVX2 implementation on |
| machines where support is detected.</p> |
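<p>Putting those pieces together, a complete version of the pattern could look like the sketch below. The names <code>add_lanes_dynamic</code>, <code>add_lanes_avx2</code> and <code>add_lanes_fallback</code>, and the choice of four <code>i64</code> lanes, are hypothetical and chosen only for this illustration.</p>

```rust
/// Adds four `i64` lanes, choosing an implementation at runtime.
pub fn add_lanes_dynamic(a: [i64; 4], b: [i64; 4]) -> [i64; 4] {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // The runtime check above confirmed AVX2 on this CPU,
            // which is what `add_lanes_avx2` requires of its callers.
            return unsafe { add_lanes_avx2(a, b) };
        }
    }

    add_lanes_fallback(a, b)
}

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn add_lanes_avx2(a: [i64; 4], b: [i64; 4]) -> [i64; 4] {
    #[cfg(target_arch = "x86")]
    use std::arch::x86::*;
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    let mut out = [0_i64; 4];
    // Unaligned loads and stores touch exactly the 32 bytes of each array.
    unsafe {
        let va = _mm256_loadu_si256(a.as_ptr() as *const __m256i);
        let vb = _mm256_loadu_si256(b.as_ptr() as *const __m256i);
        let sum = _mm256_add_epi64(va, vb);
        _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, sum);
    }
    out
}

/// Portable version; wraps on overflow to match the intrinsic's behavior.
pub fn add_lanes_fallback(a: [i64; 4], b: [i64; 4]) -> [i64; 4] {
    let mut out = [0_i64; 4];
    for i in 0..4 {
        out[i] = a[i].wrapping_add(b[i]);
    }
    out
}
```

<p>Callers only ever see the safe <code>add_lanes_dynamic</code> entry point; the <code>unsafe</code> call is confined to the one place where the feature check has just succeeded.</p>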
| <h1 id="ergonomics" class="section-header"><a href="#ergonomics">Ergonomics</a></h1> |
| <p>It's important to note that using the <code>arch</code> module is not the easiest |
| thing in the world, so if you're curious to try it out you may want to |
| brace yourself for some wordiness!</p> |
| <p>The primary purpose of this module is to enable stable crates on crates.io |
| to build up much more ergonomic abstractions which end up using SIMD under |
| the hood. Over time these abstractions may also move into the standard |
| library itself, but for now this module is tasked with providing the bare |
| minimum necessary to use vendor intrinsics on stable Rust.</p> |
| <h1 id="other-architectures" class="section-header"><a href="#other-architectures">Other architectures</a></h1> |
| <p>This documentation is only for one particular architecture; you can find |
| others at:</p> |
| <ul> |
| <li><a href="x86/index.html"><code>x86</code></a></li> |
| <li><a href="x86_64/index.html"><code>x86_64</code></a></li> |
| <li><a href="arm/index.html"><code>arm</code></a></li> |
| <li><a href="aarch64/index.html"><code>aarch64</code></a></li> |
| <li><a href="mips/index.html"><code>mips</code></a></li> |
| <li><a href="mips64/index.html"><code>mips64</code></a></li> |
| <li><a href="powerpc/index.html"><code>powerpc</code></a></li> |
| <li><a href="powerpc64/index.html"><code>powerpc64</code></a></li> |
| <li><a href="nvptx/index.html"><code>nvptx</code></a></li> |
| <li><a href="wasm32/index.html"><code>wasm32</code></a></li> |
| </ul> |
| <h1 id="examples" class="section-header"><a href="#examples">Examples</a></h1> |
| <p>First let's take a look at not actually using any intrinsics but instead |
| using LLVM's auto-vectorization to produce optimized vectorized code for |
| AVX2 and also for the default platform.</p> |
| |
| <div class="example-wrap"><pre class="rust rust-example-rendered"> |
| |
| <span class="kw">fn</span> <span class="ident">main</span>() { |
| <span class="kw">let</span> <span class="kw-2">mut</span> <span class="ident">dst</span> <span class="op">=</span> [<span class="number">0</span>]; |
| <span class="ident">add_quickly</span>(<span class="kw-2">&</span>[<span class="number">1</span>], <span class="kw-2">&</span>[<span class="number">2</span>], <span class="kw-2">&</span><span class="kw-2">mut</span> <span class="ident">dst</span>); |
| <span class="macro">assert_eq</span><span class="macro">!</span>(<span class="ident">dst</span>[<span class="number">0</span>], <span class="number">3</span>); |
| } |
| |
| <span class="kw">fn</span> <span class="ident">add_quickly</span>(<span class="ident">a</span>: <span class="kw-2">&</span>[<span class="ident">u8</span>], <span class="ident">b</span>: <span class="kw-2">&</span>[<span class="ident">u8</span>], <span class="ident">c</span>: <span class="kw-2">&</span><span class="kw-2">mut</span> [<span class="ident">u8</span>]) { |
| <span class="attribute">#[<span class="ident">cfg</span>(<span class="ident">any</span>(<span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86"</span>, <span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86_64"</span>))]</span> |
| { |
| <span class="comment">// Note that this `unsafe` block is safe because we're testing</span> |
| <span class="comment">// that the `avx2` feature is indeed available on our CPU.</span> |
| <span class="kw">if</span> <span class="macro">is_x86_feature_detected</span><span class="macro">!</span>(<span class="string">"avx2"</span>) { |
| <span class="kw">return</span> <span class="kw">unsafe</span> { <span class="ident">add_quickly_avx2</span>(<span class="ident">a</span>, <span class="ident">b</span>, <span class="ident">c</span>) }; |
| } |
| } |
| |
| <span class="ident">add_quickly_fallback</span>(<span class="ident">a</span>, <span class="ident">b</span>, <span class="ident">c</span>) |
| } |
| |
| <span class="attribute">#[<span class="ident">cfg</span>(<span class="ident">any</span>(<span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86"</span>, <span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86_64"</span>))]</span> |
| <span class="attribute">#[<span class="ident">target_feature</span>(<span class="ident">enable</span> <span class="op">=</span> <span class="string">"avx2"</span>)]</span> |
| <span class="kw">unsafe</span> <span class="kw">fn</span> <span class="ident">add_quickly_avx2</span>(<span class="ident">a</span>: <span class="kw-2">&</span>[<span class="ident">u8</span>], <span class="ident">b</span>: <span class="kw-2">&</span>[<span class="ident">u8</span>], <span class="ident">c</span>: <span class="kw-2">&</span><span class="kw-2">mut</span> [<span class="ident">u8</span>]) { |
| <span class="ident">add_quickly_fallback</span>(<span class="ident">a</span>, <span class="ident">b</span>, <span class="ident">c</span>) <span class="comment">// the function below is inlined here</span> |
| } |
| |
| <span class="kw">fn</span> <span class="ident">add_quickly_fallback</span>(<span class="ident">a</span>: <span class="kw-2">&</span>[<span class="ident">u8</span>], <span class="ident">b</span>: <span class="kw-2">&</span>[<span class="ident">u8</span>], <span class="ident">c</span>: <span class="kw-2">&</span><span class="kw-2">mut</span> [<span class="ident">u8</span>]) { |
| <span class="kw">for</span> ((<span class="ident">a</span>, <span class="ident">b</span>), <span class="ident">c</span>) <span class="kw">in</span> <span class="ident">a</span>.<span class="ident">iter</span>().<span class="ident">zip</span>(<span class="ident">b</span>).<span class="ident">zip</span>(<span class="ident">c</span>) { |
| <span class="kw-2">*</span><span class="ident">c</span> <span class="op">=</span> <span class="kw-2">*</span><span class="ident">a</span> <span class="op">+</span> <span class="kw-2">*</span><span class="ident">b</span>; |
| } |
| }</pre></div> |
| <p>Next up let's take a look at an example of manually using intrinsics. Here |
| we'll be using SSE4.1 features to implement hex encoding.</p> |
| |
| <div class="example-wrap"><pre class="rust rust-example-rendered"> |
| <span class="kw">fn</span> <span class="ident">main</span>() { |
| <span class="kw">let</span> <span class="kw-2">mut</span> <span class="ident">dst</span> <span class="op">=</span> [<span class="number">0</span>; <span class="number">32</span>]; |
| <span class="ident">hex_encode</span>(<span class="string">b"\x01\x02\x03"</span>, <span class="kw-2">&</span><span class="kw-2">mut</span> <span class="ident">dst</span>); |
| <span class="macro">assert_eq</span><span class="macro">!</span>(<span class="kw-2">&</span><span class="ident">dst</span>[..<span class="number">6</span>], <span class="string">b"010203"</span>); |
| |
| <span class="kw">let</span> <span class="kw-2">mut</span> <span class="ident">src</span> <span class="op">=</span> [<span class="number">0</span>; <span class="number">16</span>]; |
| <span class="kw">for</span> <span class="ident">i</span> <span class="kw">in</span> <span class="number">0</span>..<span class="number">16</span> { |
| <span class="ident">src</span>[<span class="ident">i</span>] <span class="op">=</span> (<span class="ident">i</span> <span class="op">+</span> <span class="number">1</span>) <span class="kw">as</span> <span class="ident">u8</span>; |
| } |
| <span class="ident">hex_encode</span>(<span class="kw-2">&</span><span class="ident">src</span>, <span class="kw-2">&</span><span class="kw-2">mut</span> <span class="ident">dst</span>); |
| <span class="macro">assert_eq</span><span class="macro">!</span>(<span class="kw-2">&</span><span class="ident">dst</span>, <span class="string">b"0102030405060708090a0b0c0d0e0f10"</span>); |
| } |
| |
| <span class="kw">pub</span> <span class="kw">fn</span> <span class="ident">hex_encode</span>(<span class="ident">src</span>: <span class="kw-2">&</span>[<span class="ident">u8</span>], <span class="ident">dst</span>: <span class="kw-2">&</span><span class="kw-2">mut</span> [<span class="ident">u8</span>]) { |
| <span class="kw">let</span> <span class="ident">len</span> <span class="op">=</span> <span class="ident">src</span>.<span class="ident">len</span>().<span class="ident">checked_mul</span>(<span class="number">2</span>).<span class="ident">unwrap</span>(); |
| <span class="macro">assert</span><span class="macro">!</span>(<span class="ident">dst</span>.<span class="ident">len</span>() <span class="op">>=</span> <span class="ident">len</span>); |
| |
| <span class="attribute">#[<span class="ident">cfg</span>(<span class="ident">any</span>(<span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86"</span>, <span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86_64"</span>))]</span> |
| { |
| <span class="kw">if</span> <span class="macro">is_x86_feature_detected</span><span class="macro">!</span>(<span class="string">"sse4.1"</span>) { |
| <span class="kw">return</span> <span class="kw">unsafe</span> { <span class="ident">hex_encode_sse41</span>(<span class="ident">src</span>, <span class="ident">dst</span>) }; |
| } |
| } |
| |
| <span class="ident">hex_encode_fallback</span>(<span class="ident">src</span>, <span class="ident">dst</span>) |
| } |
| |
| <span class="comment">// translated from</span> |
| <span class="comment">// https://github.com/Matherunner/bin2hex-sse/blob/master/base16_sse4.cpp</span> |
| <span class="attribute">#[<span class="ident">target_feature</span>(<span class="ident">enable</span> <span class="op">=</span> <span class="string">"sse4.1"</span>)]</span> |
| <span class="attribute">#[<span class="ident">cfg</span>(<span class="ident">any</span>(<span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86"</span>, <span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86_64"</span>))]</span> |
| <span class="kw">unsafe</span> <span class="kw">fn</span> <span class="ident">hex_encode_sse41</span>(<span class="kw-2">mut</span> <span class="ident">src</span>: <span class="kw-2">&</span>[<span class="ident">u8</span>], <span class="ident">dst</span>: <span class="kw-2">&</span><span class="kw-2">mut</span> [<span class="ident">u8</span>]) { |
| <span class="attribute">#[<span class="ident">cfg</span>(<span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86"</span>)]</span> |
| <span class="kw">use</span> <span class="ident">std</span>::<span class="ident">arch</span>::<span class="ident">x86</span>::<span class="kw-2">*</span>; |
| <span class="attribute">#[<span class="ident">cfg</span>(<span class="ident">target_arch</span> <span class="op">=</span> <span class="string">"x86_64"</span>)]</span> |
| <span class="kw">use</span> <span class="ident">std</span>::<span class="ident">arch</span>::<span class="ident">x86_64</span>::<span class="kw-2">*</span>; |
| |
| <span class="kw">let</span> <span class="ident">ascii_zero</span> <span class="op">=</span> <span class="ident">_mm_set1_epi8</span>(<span class="string">b'0'</span> <span class="kw">as</span> <span class="ident">i8</span>); |
| <span class="kw">let</span> <span class="ident">nines</span> <span class="op">=</span> <span class="ident">_mm_set1_epi8</span>(<span class="number">9</span>); |
| <span class="kw">let</span> <span class="ident">ascii_a</span> <span class="op">=</span> <span class="ident">_mm_set1_epi8</span>((<span class="string">b'a'</span> <span class="op">-</span> <span class="number">9</span> <span class="op">-</span> <span class="number">1</span>) <span class="kw">as</span> <span class="ident">i8</span>); |
| <span class="kw">let</span> <span class="ident">and4bits</span> <span class="op">=</span> <span class="ident">_mm_set1_epi8</span>(<span class="number">0xf</span>); |
| |
| <span class="kw">let</span> <span class="kw-2">mut</span> <span class="ident">i</span> <span class="op">=</span> <span class="number">0_isize</span>; |
| <span class="kw">while</span> <span class="ident">src</span>.<span class="ident">len</span>() <span class="op">>=</span> <span class="number">16</span> { |
| <span class="kw">let</span> <span class="ident">invec</span> <span class="op">=</span> <span class="ident">_mm_loadu_si128</span>(<span class="ident">src</span>.<span class="ident">as_ptr</span>() <span class="kw">as</span> <span class="kw-2">*</span><span class="kw">const</span> <span class="kw">_</span>); |
| |
| <span class="kw">let</span> <span class="ident">masked1</span> <span class="op">=</span> <span class="ident">_mm_and_si128</span>(<span class="ident">invec</span>, <span class="ident">and4bits</span>); |
| <span class="kw">let</span> <span class="ident">masked2</span> <span class="op">=</span> <span class="ident">_mm_and_si128</span>(<span class="ident">_mm_srli_epi64</span>(<span class="ident">invec</span>, <span class="number">4</span>), <span class="ident">and4bits</span>); |
| |
| <span class="comment">// return 0xff corresponding to the elements > 9, or 0x00 otherwise</span> |
| <span class="kw">let</span> <span class="ident">cmpmask1</span> <span class="op">=</span> <span class="ident">_mm_cmpgt_epi8</span>(<span class="ident">masked1</span>, <span class="ident">nines</span>); |
| <span class="kw">let</span> <span class="ident">cmpmask2</span> <span class="op">=</span> <span class="ident">_mm_cmpgt_epi8</span>(<span class="ident">masked2</span>, <span class="ident">nines</span>); |
| |
| <span class="comment">// add '0' or the offset depending on the masks</span> |
| <span class="kw">let</span> <span class="ident">masked1</span> <span class="op">=</span> <span class="ident">_mm_add_epi8</span>( |
| <span class="ident">masked1</span>, |
| <span class="ident">_mm_blendv_epi8</span>(<span class="ident">ascii_zero</span>, <span class="ident">ascii_a</span>, <span class="ident">cmpmask1</span>), |
| ); |
| <span class="kw">let</span> <span class="ident">masked2</span> <span class="op">=</span> <span class="ident">_mm_add_epi8</span>( |
| <span class="ident">masked2</span>, |
| <span class="ident">_mm_blendv_epi8</span>(<span class="ident">ascii_zero</span>, <span class="ident">ascii_a</span>, <span class="ident">cmpmask2</span>), |
| ); |
| |
| <span class="comment">// interleave masked1 and masked2 bytes</span> |
| <span class="kw">let</span> <span class="ident">res1</span> <span class="op">=</span> <span class="ident">_mm_unpacklo_epi8</span>(<span class="ident">masked2</span>, <span class="ident">masked1</span>); |
| <span class="kw">let</span> <span class="ident">res2</span> <span class="op">=</span> <span class="ident">_mm_unpackhi_epi8</span>(<span class="ident">masked2</span>, <span class="ident">masked1</span>); |
| |
| <span class="ident">_mm_storeu_si128</span>(<span class="ident">dst</span>.<span class="ident">as_mut_ptr</span>().<span class="ident">offset</span>(<span class="ident">i</span> <span class="op">*</span> <span class="number">2</span>) <span class="kw">as</span> <span class="kw-2">*</span><span class="kw-2">mut</span> <span class="kw">_</span>, <span class="ident">res1</span>); |
| <span class="ident">_mm_storeu_si128</span>( |
| <span class="ident">dst</span>.<span class="ident">as_mut_ptr</span>().<span class="ident">offset</span>(<span class="ident">i</span> <span class="op">*</span> <span class="number">2</span> <span class="op">+</span> <span class="number">16</span>) <span class="kw">as</span> <span class="kw-2">*</span><span class="kw-2">mut</span> <span class="kw">_</span>, |
| <span class="ident">res2</span>, |
| ); |
| <span class="ident">src</span> <span class="op">=</span> <span class="kw-2">&</span><span class="ident">src</span>[<span class="number">16</span>..]; |
| <span class="ident">i</span> <span class="op">+=</span> <span class="number">16</span>; |
| } |
| |
| <span class="kw">let</span> <span class="ident">i</span> <span class="op">=</span> <span class="ident">i</span> <span class="kw">as</span> <span class="ident">usize</span>; |
| <span class="ident">hex_encode_fallback</span>(<span class="ident">src</span>, <span class="kw-2">&</span><span class="kw-2">mut</span> <span class="ident">dst</span>[<span class="ident">i</span> <span class="op">*</span> <span class="number">2</span>..]); |
| } |
| |
| <span class="kw">fn</span> <span class="ident">hex_encode_fallback</span>(<span class="ident">src</span>: <span class="kw-2">&</span>[<span class="ident">u8</span>], <span class="ident">dst</span>: <span class="kw-2">&</span><span class="kw-2">mut</span> [<span class="ident">u8</span>]) { |
| <span class="kw">fn</span> <span class="ident">hex</span>(<span class="ident">byte</span>: <span class="ident">u8</span>) <span class="op">-></span> <span class="ident">u8</span> { |
| <span class="kw">static</span> <span class="ident">TABLE</span>: <span class="kw-2">&</span>[<span class="ident">u8</span>] <span class="op">=</span> <span class="string">b"0123456789abcdef"</span>; |
| <span class="ident">TABLE</span>[<span class="ident">byte</span> <span class="kw">as</span> <span class="ident">usize</span>] |
| } |
| |
| <span class="kw">for</span> (<span class="ident">byte</span>, <span class="ident">slots</span>) <span class="kw">in</span> <span class="ident">src</span>.<span class="ident">iter</span>().<span class="ident">zip</span>(<span class="ident">dst</span>.<span class="ident">chunks_mut</span>(<span class="number">2</span>)) { |
| <span class="ident">slots</span>[<span class="number">0</span>] <span class="op">=</span> <span class="ident">hex</span>((<span class="kw-2">*</span><span class="ident">byte</span> <span class="op">>></span> <span class="number">4</span>) <span class="op">&</span> <span class="number">0xf</span>); |
| <span class="ident">slots</span>[<span class="number">1</span>] <span class="op">=</span> <span class="ident">hex</span>(<span class="kw-2">*</span><span class="ident">byte</span> <span class="op">&</span> <span class="number">0xf</span>); |
| } |
| }</pre></div> |
| </div><h2 id='modules' class='section-header'><a href="#modules">Modules</a></h2> |
| <table><tr class='unstable module-item'><td><a class="mod" href="arm/index.html" title='core_arch::arm mod'>arm</a></td><td class='docblock-short'><span class="stab unstable">Experimental</span><span class="stab portability">ARM</span><p>Platform-specific intrinsics for the <code>arm</code> platform.</p> |
| </td></tr></table></section><section id="search" class="content hidden"></section><section class="footer"></section><aside id="help" class="hidden"><div><h1 class="hidden">Help</h1><div class="shortcuts"><h2>Keyboard Shortcuts</h2><dl><dt><kbd>?</kbd></dt><dd>Show this help dialog</dd><dt><kbd>S</kbd></dt><dd>Focus the search field</dd><dt><kbd>↑</kbd></dt><dd>Move up in search results</dd><dt><kbd>↓</kbd></dt><dd>Move down in search results</dd><dt><kbd>↹</kbd></dt><dd>Switch tab</dd><dt><kbd>⏎</kbd></dt><dd>Go to active search result</dd><dt><kbd>+</kbd></dt><dd>Expand all sections</dd><dt><kbd>-</kbd></dt><dd>Collapse all sections</dd></dl></div><div class="infos"><h2>Search Tricks</h2><p>Prefix searches with a type followed by a colon (e.g., <code>fn:</code>) to restrict the search to a given type.</p><p>Accepted types are: <code>fn</code>, <code>mod</code>, <code>struct</code>, <code>enum</code>, <code>trait</code>, <code>type</code>, <code>macro</code>, and <code>const</code>.</p><p>Search functions by type signature (e.g., <code>vec -> usize</code> or <code>* -> vec</code>)</p><p>Search multiple things at once by splitting your query with comma (e.g., <code>str,u8</code> or <code>String,struct:Vec,test</code>)</p></div></div></aside><script>window.rootPath = "../";window.currentCrate = "core_arch";</script><script src="../aliases.js"></script><script src="../main.js"></script><script defer src="../search-index.js"></script></body></html> |